8282648: Weaken the InflaterInputStream specification in order to allow faster Zip implementations #7986
/csr |
👋 Welcome back simonis! A progress list of the required criteria for merging this PR into |
@simonis has indicated that a compatibility and specification (CSR) request is needed for this pull request. |
Hi Volker, I believe your PR should point to the JBS issue in the title, which references the CSR and not the CSR directly in the title.
128166f to b55fc33
Sorry, you're right of course :)
src/java.base/share/classes/java/util/zip/InflaterInputStream.java
Hello Volker,
As we see above, none of these APIs talk about |
You are right with your observation and I'll be happy to add a corresponding comment if @LanceAndersen and @AlanBateman agree. Please let me know what you think?
One other way to communicate changes is in the release-note. I added release-note=yes to the bug.
Hi Volker, I believe Jai raises a valid point given these javadocs probably have had limited updates, if any, since the API was originally added. We should look at ZipInputStream and GZIPInputStream as well if we decide to update ZipFile::getInputStream (where we could borrow some wording from the ZipInputStream class description as a start to some wordsmithing). As Roger points out, we will need a release note for this change as well.
A suggestion for the structure is to start the new paragraph by saying it reads n bytes of uncompressed data into b[off] to b[off+n-1]. Then follow to say that the contents of b[off+n] to b[off+len-1] after the read are undefined, then say that an implementation is free to ... Finally just finish it by saying that the contents are also undefined when the method fails by throwing an exception.
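For callers, the suggested wording boils down to consuming only the first n bytes of the buffer after each read and treating everything else in the requested range as undefined. A minimal sketch of such a read loop follows, assuming a small in-memory payload; the class name InflateReadLoop, the helper names and the 64-byte buffer size are made up for illustration and are not part of this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class, for illustration only.
class InflateReadLoop {

    /** Compresses data with the default Deflater settings. */
    static byte[] deflate(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(bos)) {
            dos.write(data);
        }
        return bos.toByteArray();
    }

    /** Inflates everything, copying only the bytes each read reports. */
    static byte[] inflateAll(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64]; // arbitrary size chosen for this sketch
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            int n;
            while ((n = in.read(buf, 0, buf.length)) != -1) {
                // Only buf[0] .. buf[n-1] are defined after the call;
                // the rest of the requested range is never looked at.
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }
}
```

A loop written this way behaves the same no matter whether the underlying zlib implementation touches the remainder of the buffer.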
Hi, I've pushed a new version which:
Please let me know what you think? |
Hi Volker,
I think this reads much better. It's too bad we cannot take advantage of {@inheritDoc}.
A couple of minor comments below.
You've addressed my points, the updated javadoc looks good.
Hi Volker,
The updates look good. Thank you. Next up is updating the CSR with the changes and crafting the release-note as a subtask.
Thank you for your efforts here.
Thanks @AlanBateman, @LanceAndersen. I've just updated the CSR with the latest wording from this PR. Please feel free to review the CSR as well so I can finalize it.
…n note on ZipFile::getInputStream and aligned wording for all ::read methods
d62cba4 to 62e25d4
The latest push contains the following changes:
|
@simonis Please do not rebase or force-push to an active PR as it invalidates existing review comments. All changes will be squashed into a single commit automatically when integrating. See OpenJDK Developers’ Guide for more information. |
@simonis This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Please update the CSR once you’ve finalized the specification changes.
Volker,
Thank you for your patience while we worked through all of the nuances of this.
The changes look good to me.
@mbreinhold, @LanceAndersen thanks for your reviews. I'm waiting for the CSR approval before pushing.
The update in d82c752 looks good.
@simonis This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 422 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Typo cleanup looks fine.
/integrate
Going to push as commit 2c61efe.
Your commit was automatically rebased without conflicts.
Add an API note to InflaterInputStream::read(byte[] b, int off, int len) to highlight that it might write more bytes than the returned number of inflated bytes into the buffer b.

The superclass java.io.InputStream specifies that read(byte[] b, int off, int len) will leave the content beyond the last read byte in the read buffer b unaffected. However, the overridden read method in InflaterInputStream passes the read buffer b to Inflater::inflate(byte[] b, int off, int len), which doesn't provide this guarantee. Depending on implementation details, Inflater::inflate might write more than the returned number of inflated bytes into the buffer b.

TL;DR

java.util.zip.Inflater is the Java wrapper class for zlib's inflater functionality. Inflater::inflate(byte[] output, int off, int len) currently calls zlib's native inflate(..) function and passes the address of output[off] and len to it via JNI.

The specification of zlib's inflate(..) function (i.e. the API documentation in the original zlib implementation) doesn't give any guarantees with regard to usage of the output buffer. It only states that upon completion the function will return the number of bytes that have been written (i.e. "inflated") into the output buffer.

The original zlib implementation only wrote as many bytes into the output buffer as it inflated. However, this is not a hard requirement and newer, more performant implementations of the zlib library like zlib-chromium or zlib-cloudflare can use more bytes of the output buffer than they actually inflate as a scratch buffer. See https://github.com/simonis/zlib-chromium for a more detailed description of their approach and its performance benefit.

These new zlib versions can still be used transparently from Java (e.g. by putting them into the LD_LIBRARY_PATH or by using LD_PRELOAD), because they still fully comply with the specification of Inflater::inflate(..). However, we might run into problems when using the Inflater functionality from the InflaterInputStream class. InflaterInputStream is derived from InputStream and as such, its read(byte[] b, int off, int len) method is quite constrained. It specifically specifies that if k bytes have been read, then "these bytes will be stored in elements b[off] through b[off+k-1], leaving elements b[off+k] through b[off+len-1] unaffected". But InflaterInputStream::read(byte[] b, int off, int len) (which is constrained by InputStream::read(..)'s specification) calls Inflater::inflate(byte[] b, int off, int len) and directly passes its output buffer down to the native zlib inflate(..) method, which is free to change the bytes from b[off+k] onwards (where k is the number of inflated bytes).

From a practical point of view, I don't see this as a big problem, because callers of InflaterInputStream::read(byte[] b, int off, int len) can never know how many bytes will be written into the output buffer b (and in fact its content can always be completely overwritten). It therefore makes no sense to depend on any data there being untouched after the call. Also, having used zlib-cloudflare in production for about two years, we haven't seen real-world issues because of this behavior yet. However, from a specification point of view it is easy to artificially construct a program which violates InflaterInputStream::read(..)'s postcondition when using one of the alternative zlib implementations. A recently integrated JTreg test (test/jdk/jdk/nio/zipfs/ZipFSOutputStreamTest.java) "unintentionally" fails with zlib-chromium but can be fixed easily to run with alternative implementations as well (see JDK-8283756).
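Such an artificially constructed program could look roughly like the sketch below. It pre-fills the read buffer with a sentinel value, performs a single InflaterInputStream::read and then checks whether any byte after the last inflated one has changed. The class name InflateTailProbe, the sentinel value, the offsets and the sample payload are assumptions made for illustration; this is not the JTreg test mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical probe class, for illustration only.
class InflateTailProbe {
    static final byte SENTINEL = (byte) 0x5A; // arbitrary marker value

    /** Compresses data with the default Deflater settings. */
    static byte[] deflate(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(bos)) {
            dos.write(data);
        }
        return bos.toByteArray();
    }

    /** Fills buf with the sentinel, does one read and returns its result. */
    static int readOnce(byte[] compressed, byte[] buf, int off, int len)
            throws IOException {
        Arrays.fill(buf, SENTINEL);
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            return in.read(buf, off, len);
        }
    }

    /** Returns true if any byte in buf[from] .. buf[to-1] lost the sentinel. */
    static boolean tailTouched(byte[] buf, int from, int to) {
        for (int i = from; i < to; i++) {
            if (buf[i] != SENTINEL) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello, inflater".getBytes(StandardCharsets.US_ASCII);
        byte[] buf = new byte[256];
        int off = 16;
        int len = 128;
        int n = readOnce(deflate(data), buf, off, len);
        System.out.println("inflated bytes: " + n);
        // Whether this reports true depends on the zlib implementation in use.
        System.out.println("tail modified: " + tailTouched(buf, off + n, off + len));
    }
}
```

According to the description above, the original zlib implementation would leave the tail untouched, while an implementation that uses the output buffer as scratch space may report a modified tail.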
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/7986/head:pull/7986
$ git checkout pull/7986
Update a local copy of the PR:
$ git checkout pull/7986
$ git pull https://git.openjdk.org/jdk pull/7986/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 7986
View PR using the GUI difftool:
$ git pr show -t 7986
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/7986.diff