8265448: (zipfs): Reduce read contention in ZipFileSystem #3853

Closed
wants to merge 5 commits into from

Conversation

retronym

@retronym retronym commented May 4, 2021

If the given Path represents a file, use the overload of read defined
in FileChannel that accepts an explicit position and avoid serializing
reads.

Note: The underlying NIO implementation is not required to implement
FileChannel.read(ByteBuffer, long) concurrently; Windows still appears
to lock, as it returns true for NativeDispatcher.needsPositionLock.

On MacOS X, the enclosed benchmark improves from:

Benchmark                    Mode  Cnt   Score   Error  Units
ZipFileSystemBenchmark.read  avgt   10  75.311 ± 3.301  ms/op

To:

Benchmark                    Mode  Cnt   Score   Error  Units
ZipFileSystemBenchmark.read  avgt   10  12.520 ± 0.875  ms/op
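
The difference between the two read paths described above can be sketched as follows. This is a minimal illustration, not the patch itself: the class name PositionalReadSketch and the helpers readAt and demo are hypothetical, and the temporary file stands in for a zip file on the default file system.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalReadSketch {

    // Hypothetical helper: reads into bb starting at absolute offset pos.
    static int readAt(SeekableByteChannel ch, ByteBuffer bb, long pos) throws IOException {
        if (ch instanceof FileChannel fch) {
            // Positional read: leaves the channel's own position untouched,
            // so callers do not need to hold a shared lock.
            return fch.read(bb, pos);
        }
        // Fallback: position() followed by read() must not interleave
        // with another thread's position()/read() pair.
        synchronized (ch) {
            return ch.position(pos).read(bb);
        }
    }

    // Reads four bytes starting at offset 3 of a ten-byte temporary file.
    static String demo() throws IOException {
        Path tmp = Files.createTempFile("positional-read", ".bin");
        try {
            Files.writeString(tmp, "0123456789");
            try (SeekableByteChannel ch = Files.newByteChannel(tmp, StandardOpenOption.READ)) {
                ByteBuffer bb = ByteBuffer.allocate(4);
                // Loop because a single read may return fewer bytes than requested.
                while (bb.hasRemaining()) {
                    if (readAt(ch, bb, 3 + bb.position()) < 0) {
                        break;
                    }
                }
                bb.flip();
                return StandardCharsets.US_ASCII.decode(bb).toString();
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // prints "3456"
    }
}
```

Whether the positional read actually runs concurrently depends on the platform's NIO implementation, as the note above says.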

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8265448: (zipfs): Reduce read contention in ZipFileSystem

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3853/head:pull/3853
$ git checkout pull/3853

Update a local copy of the PR:
$ git checkout pull/3853
$ git pull https://git.openjdk.java.net/jdk pull/3853/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 3853

View PR using the GUI difftool:
$ git pr show -t 3853

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3853.diff

@bridgekeeper

bridgekeeper bot commented May 4, 2021

👋 Welcome back jzaugg! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label May 4, 2021
@retronym retronym changed the title from "8265448: Avoid lock contention in reads from zipfs when possible" to "8265448: (zipfs): Reduce read contention in ZipFileSystem" May 4, 2021
@openjdk

openjdk bot commented May 4, 2021

@retronym The following labels will be automatically applied to this pull request:

  • core-libs
  • nio

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added nio nio-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels May 4, 2021
@mlbridge

mlbridge bot commented May 4, 2021

Webrevs

} else {
    synchronized (zfch) {
        n = zfch.position(pos).read(bb);
    }
Contributor

@LanceAndersen Are you planning to look at this? Do you mind checking the async close case to make sure that the synchronization isn't masking anything?

Also, just to point out that pattern matching for instanceof can be used here.

Contributor

Yes, I plan to look at this. It would also be good to have a couple of additional reviews as well :-)

Contributor

@AlanBateman AlanBateman May 5, 2021

I think using the positional read on the underlying FileChannel is okay. I'm puzzled by the previous code, as I would have expected it to restore the position (it makes me wonder whether there are zipfs tests for this).

Author

My reading of the existing code is that the only position-influenced method called on the channel (either via ZipFileSystem.ch or ZipFileSystem$EntryInputStream.zfch) is read, and this is only called in the .position(pos).read(...) idiom. The failure to reset the position therefore doesn't affect correctness. However, the synchronized block is definitely needed to avoid races.

Incidentally, regarding this comment:

private class EntryInputStream extends InputStream {
    private final SeekableByteChannel zfch; // local ref to zipfs's "ch". zipfs.ch might
                                            // point to a new channel after sync()

If the file system is writable and updated, the underlying file is deleted and replaced with a temporary file by close() / sync(), but ZipFileSystem.ch is itself final since d581e4f. I believe the comment is outdated and EntryInputStream could just access ch via the outer pointer. That change would simplify this patch marginally.

Author

I've added the simplifying commit for now, but I'm happy to split that to a separate change if you prefer.

} else {
    synchronized(ch) {
        return ch.position(pos).read(bb);
    }
Contributor

I think it's okay to include the update to EntryInputStream; that part looks fine, as does the direct use of the FileChannel positional read.

I'm still mulling over the case where ch is not a FileChannel, as I would have expected it to capture the existing position and restore it after the read. I think this is the degenerate case where the zip file is located in a custom file system that doesn't support FileChannel. In that case, positional read has to be implemented on the most basic SeekableByteChannel. The missing restore would only be observable when mixing positional read ops with other ops that depend on the current position.
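
For illustration, a positional read that captures and restores the channel position might look like the sketch below. This is an assumption about what such a variant could look like, not code from the patch or from zipfs; the class RestorePositionSketch and the helpers readAtAndRestore and positionAfterRead are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class RestorePositionSketch {

    // Hypothetical helper: emulates a positional read on a plain
    // SeekableByteChannel and puts the position back afterwards.
    static int readAtAndRestore(SeekableByteChannel ch, ByteBuffer bb, long pos) throws IOException {
        synchronized (ch) {
            long saved = ch.position();
            try {
                return ch.position(pos).read(bb);
            } finally {
                // Restore even if the read throws, so position-dependent
                // callers never observe the temporary seek.
                ch.position(saved);
            }
        }
    }

    // Sets the position to 10, performs a positional read at offset 40,
    // and reports the position the caller sees afterwards.
    static long positionAfterRead() throws IOException {
        Path tmp = Files.createTempFile("restore-position", ".bin");
        try {
            Files.write(tmp, new byte[64]);
            try (SeekableByteChannel ch = Files.newByteChannel(tmp, StandardOpenOption.READ)) {
                ch.position(10);
                readAtAndRestore(ch, ByteBuffer.allocate(16), 40);
                return ch.position();
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(positionAfterRead()); // prints 10
    }
}
```

The restore only matters if some other caller relies on the channel's current position; it costs an extra position call per read.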

Author

Here are all the references to ch.

this.ch = Files.newByteChannel(zfpath, READ);
...
this.ch.close();
...
ch.close();              // close the ch just in case no update
...
if (ch instanceof FileChannel fch) {
    return fch.read(bb, pos);
} else {
    synchronized(ch) {
        return ch.position(pos).read(bb);
    }
}
...
long ziplen = ch.size();
...
ch.close();

It appears that the only position-dependent operation called is read(ByteBuffer). It is performed together with the position(pos) call inside the synchronized(ch) block.

Author

I have confirmed that the non-FileChannel code path is exercised by existing tests.

test/jdk/jdk/nio/zipfs/ZipFSTester.java includes a test that forms a file system based on a JAR that is itself an entry within another ZipFileSystem.

Sample stacks:

java.lang.Throwable: readFullyAt. ch.getClass=class jdk.nio.zipfs.ByteArrayChannel
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem.readFullyAt(ZipFileSystem.java:1234)
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem.readFullyAt(ZipFileSystem.java:1226)
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem$EntryInputStream.initDataPos(ZipFileSystem.java:2259)
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem$EntryInputStream.read(ZipFileSystem.java:2201)
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem$2.fill(ZipFileSystem.java:2151)
	at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
	at ZipFSTester.checkEqual(ZipFSTester.java:858)
	at ZipFSTester.test1(ZipFSTester.java:259)
java.lang.Throwable: readFullyAt. ch.getClass=class jdk.nio.zipfs.ByteArrayChannel
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem.readFullyAt(ZipFileSystem.java:1234)
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem$EntryInputStream.read(ZipFileSystem.java:2214)
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem$2.fill(ZipFileSystem.java:2151)
	at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
	at ZipFSTester.checkEqual(ZipFSTester.java:858)
	at ZipFSTester.test1(ZipFSTester.java:259)

This use case is not covered by the ZipFSTester.test2, a multi-threaded test.
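
The nested-archive scenario mentioned above can be reproduced roughly as follows. This is a standalone sketch, not the code in ZipFSTester: the class NestedZipSketch, its helpers, and the entry names are hypothetical, and it assumes a JDK recent enough to provide FileSystems.newFileSystem(Path, Map) and to allow opening a zip file system on a path that itself lives inside a zip file system.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class NestedZipSketch {

    // Builds a small zip archive in memory with a single text entry.
    static byte[] innerZipBytes() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            zos.putNextEntry(new ZipEntry("hello.txt"));
            zos.write("hello".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return bos.toByteArray();
    }

    // Stores the inner archive as an entry of an outer archive, then opens
    // a file system on that entry and reads a file from it.
    static String readNested() throws IOException {
        Path dir = Files.createTempDirectory("nested-zip");
        Path outer = dir.resolve("outer.zip");
        try {
            try (FileSystem outerFs = FileSystems.newFileSystem(outer, Map.of("create", "true"))) {
                Files.write(outerFs.getPath("/inner.zip"), innerZipBytes());
            }
            try (FileSystem outerFs = FileSystems.newFileSystem(outer, Map.of());
                 FileSystem innerFs = FileSystems.newFileSystem(outerFs.getPath("/inner.zip"), Map.of())) {
                // The inner file system's backing channel comes from the outer
                // zip file system, so it is not a FileChannel.
                return Files.readString(innerFs.getPath("/hello.txt"));
            }
        } finally {
            Files.deleteIfExists(outer);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readNested()); // prints "hello"
    }
}
```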

While looking at the test I noticed false warnings in the output: read()/position() failed. These did not actually fail the test. I investigated and a) fixed the condition to deal with the edge case of zero-length entries and b) made the test throw a "check failed" exception when the assertion fails.

This appears to have been omitted when this test was added.
To avoid false error reports, the condition must deal with the
edge case of zero-length entries, for which read will return -1.
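
A check of this kind might be sketched as follows. The actual condition in ZipFSTester is not shown in this conversation, so the class ZeroLengthReadSketch and its helpers are hypothetical; the sketch only illustrates that the first read on a zero-length source returns -1 rather than 0 and that a failed check should throw.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroLengthReadSketch {

    // Hypothetical check: reads the channel to the end and verifies both the
    // byte count and the final position, throwing instead of only logging.
    static void checkReadAndPosition(SeekableByteChannel ch, long size) throws IOException {
        ByteBuffer bb = ByteBuffer.allocate(64);
        int n = ch.read(bb);
        boolean eofOnFirstRead = (n == -1);
        long total = 0;
        while (n > 0) {
            total += n;
            bb.clear();
            n = ch.read(bb);
        }
        // For a zero-length source the very first read must report end-of-stream.
        boolean ok = total == size
                && ch.position() == size
                && (size != 0 || eofOnFirstRead);
        if (!ok) {
            throw new RuntimeException("check failed");
        }
    }

    // Runs the check against a temporary file of the given size.
    static void checkFileOfSize(int size) throws IOException {
        Path tmp = Files.createTempFile("zero-length", ".bin");
        try {
            Files.write(tmp, new byte[size]);
            try (SeekableByteChannel ch = Files.newByteChannel(tmp, StandardOpenOption.READ)) {
                checkReadAndPosition(ch, size);
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        checkFileOfSize(0); // empty source: first read returns -1
        checkFileOfSize(5); // non-empty source: five bytes, position 5
    }
}
```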
@LanceAndersen
Contributor

Hi Jason,

I have made a pass through your proposed changes and they look OK. I am in the process of running our various Mach5 tiers against your patch to see if any unforeseen issues arise

Best
Lance

@openjdk

openjdk bot commented May 9, 2021

@retronym This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8265448: (zipfs): Reduce read contention in ZipFileSystem

Reviewed-by: alanb, lancea

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 94 new commits pushed to the master branch:

  • 9713152: 8262092: vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java SIGSEGV in memmove_ssse3
  • 23446f1: 8265128: [REDO] Optimize Vector API slice and unslice operations
  • e5d3ee3: 8266802: Shenandoah: Round up region size to page size unconditionally
  • 8851cb6: 8266779: Use <wbr> instead of ZERO_WIDTH_SPACE
  • 0cc7833: 8265208: [JEP-356] : SplittableRandom and SplittableGenerators - splits() methods does not throw NullPointerException when source is null
  • f78440a: 8266440: Shenandoah: TestReferenceShortcutCycle.java test failed on AArch64
  • de78431: 8241502: C2: Migrate x86_64.ad to MacroAssembler
  • c8b7447: 8266603: jpackage: Add missing copyright file in Java runtime .deb installers
  • c494efc: 8266774: System property values for stdout/err on Windows UTF-8
  • 25d99e5: 8266330: itableMethodEntry::initialize() asserts with archived old classes
  • ... and 84 more: https://git.openjdk.java.net/jdk/compare/ee5bba0dc4cc7c2bfe633c5a3fe731c6c37adb1d...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@AlanBateman, @LanceAndersen) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 9, 2021
Contributor

@LanceAndersen LanceAndersen left a comment

Mach5 jdk-tier1, jdk-tier, jdk-tier3 completed successfully

@retronym
Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label May 10, 2021
@openjdk

openjdk bot commented May 10, 2021

@retronym
Your change (at version 9106096) is now ready to be sponsored by a Committer.

@LanceAndersen
Contributor

/sponsor

@openjdk openjdk bot closed this May 11, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed sponsor Pull request is ready to be sponsored ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 11, 2021
@openjdk

openjdk bot commented May 11, 2021

@LanceAndersen @retronym Since your change was applied there have been 109 commits pushed to the master branch:

  • acf02ed: 8208237: Re-examine defmeth tests and update as needed
  • ac0287f: 8266770: Clean pending exception before running dynamic CDS dump
  • 7a0a57c: 8266820: micro java/nio/SelectorWakeup.java has wrong copyright header
  • d0daa72: 8266857: PipedOutputStream.sink should be volatile
  • 381de0c: 8266753: jdk/test/lib/process/ProcTest.java failed with "Exception: Proc abnormal end"
  • 2d2cd78: 8266761: AssertionError in sun.net.httpserver.ServerImpl.responseCompleted
  • 9c9c47e: 8266813: Shenandoah: Use shorter instruction sequence for checking if marking in progress
  • 0344e75: 8266794: Remove dead code notify_allocation_jvmti_allocation_event
  • 9e6e222: 8266892: avoid maybe-uninitialized gcc warnings on linux s390x
  • 6575566: 8266787: Potential overflow of pointer arithmetic in G1ArchiveAllocator
  • ... and 99 more: https://git.openjdk.java.net/jdk/compare/ee5bba0dc4cc7c2bfe633c5a3fe731c6c37adb1d...master

Your commit was automatically rebased without conflicts.

Pushed as commit 0a12605.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated nio nio-dev@openjdk.org
3 participants