8350880: (zipfs) Add support for read-only zip file systems#25178
8350880: (zipfs) Add support for read-only zip file systems#25178david-beaumont wants to merge 15 commits intoopenjdk:masterfrom
Conversation
|
👋 Welcome back david-beaumont! A progress list of the required criteria for merging this PR into |
|
@david-beaumont This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 134 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@LanceAndersen, @jaikiran, @AlanBateman) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
|
@david-beaumont The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
david-beaumont
left a comment
There was a problem hiding this comment.
Preloading some explanatory comment.
| .startsWith("Windows"); | ||
| private static final byte[] ROOTPATH = new byte[] { '/' }; | ||
| .startsWith("Windows"); | ||
| private static final byte[] ROOTPATH = new byte[]{'/'}; |
There was a problem hiding this comment.
Whitespace format changes from my editor - will revert (sorry hadn't spotted).
| private static final String PROPERTY_ACCESS_MODE = "accessMode"; | ||
|
|
||
| // Posix file permissions allow per-file access control in a posix-like fashion. | ||
| // Note that using a "readOnly" access mode will force all files, and the |
There was a problem hiding this comment.
Not correct now - will fix.
| this.defaultOwner = supportPosix ? initOwner(zfpath, env) : null; | ||
| this.defaultGroup = supportPosix ? initGroup(zfpath, env) : null; | ||
| this.defaultPermissions = supportPosix ? initPermissions(env) : null; | ||
| this.defaultPermissions = supportPosix ? Collections.unmodifiableSet(initPermissions(env)) : null; |
There was a problem hiding this comment.
Ensuring this is unmodifiable to avoid risk of returning a modifiable enum set.
| zfpath.getFileSystem().provider().checkAccess(zfpath, AccessMode.READ); | ||
| boolean writeable = Files.isWritable(zfpath); | ||
| this.readOnly = !writeable; | ||
| zfpath.getFileSystem().provider().checkAccess(zfpath, java.nio.file.AccessMode.READ); |
There was a problem hiding this comment.
Type name clash with existing AccessMode class. Since new enum is private, happy to rename.
There was a problem hiding this comment.
So perhaps change the name of the newly added enum
| // It requires 'entryLookup' and 'readOnly' to have safe defaults (which | ||
| // is why they are the only non-final fields), and it requires that the | ||
| // inode map has been initialized. | ||
| Optional<Integer> multiReleaseVersion = determineReleaseVersion(env); |
There was a problem hiding this comment.
Now release version and entry lookup determination always happen on a read-only file system.
| Object o = env.get(PROPERTY_DEFAULT_PERMISSIONS); | ||
| if (o == null) { | ||
| return DEFAULT_PERMISSIONS; | ||
| return PosixFilePermissions.fromString("rwxrwxrwx"); |
There was a problem hiding this comment.
Since DEFAULT_PERMISSIONS was mutable and only used exactly once, it seems better to just inline it where it's clear what it actually is based off. Better readability.
| * use its value to determine the requested version. If not use the value of the "multi-release" property. | ||
| */ | ||
| private void initializeReleaseVersion(Map<String, ?> env) throws IOException { | ||
| private Optional<Integer> determineReleaseVersion(Map<String, ?> env) throws IOException { |
There was a problem hiding this comment.
Refactored to return the value rather than side-effect instance fields.
| * be opened if the underlying file is missing. | ||
| */ | ||
| @Test | ||
| public void testNoSuchFileFailure() { |
There was a problem hiding this comment.
In theory some of these could be split more, but they are uncomplicated sanity checks.
test/jdk/jdk/nio/zipfs/Utils.java
Outdated
| */ | ||
| static Path createJarFile(String name, String... entries) throws IOException { | ||
| Path jarFile = Paths.get("basic.jar"); | ||
| Path jarFile = Paths.get(name); |
There was a problem hiding this comment.
Previous tests always had the test file called the same thing.
Note that in jtreg tests, the file of the same name often already exists in the test directory.
I tried adding a "ensure file doesn't already exist" check and it fails lots of tests.
There was a problem hiding this comment.
It's interesting that we have several places in the test which use this Utils.createJarFile() and pass it a test specific JAR file name and yet this utility was ignoring the passed name hard coding the name to basic.jar. It appears to be an oversight and your fix here I believe is a good thing.
I tried adding a "ensure file doesn't already exist" check and it fails lots of tests.
I think it's OK in its current form to allow the file to be overwritten. Plus, you have also updated the javadoc of this test utility method to explicitly state its current behaviour, which I think is enough to let the call sites know how this behaves.
|
Obvious mistakes now fixed (it's amazing what you only spot when looking at code in a review UI). |
| * <td>null/unset</td> | ||
| * <td> | ||
| * A value defining the desired read/write access mode of the file system | ||
| * (either <em>read-write</em> or <em>read-only</em>). |
There was a problem hiding this comment.
This line is slightly confusing. Initially I thought read-write and read-only are the actual values that this environment property will accept. But further review suggests that the actual literals supported are readOnly and readWrite and this line here I think is trying to explain the supported semantics of the file system.
Perhaps we could reword this to something like:
A value defining the desired access mode of the file system. A ZIP file system can be created to allow for read-write access or read-only access.
There was a problem hiding this comment.
Yeah, I tried to distinguish the accessMode flags (for which there are 3 states, ro, rw, n/a) and the final state of the file system from fs.isReadOnly(). Hence using <em>...</em> for whenever the actually resulting state is mentioned. I guess it wasn't clear enough.
There was a problem hiding this comment.
Thank you for updating this part, this is much more clearer now.
| case null -> { | ||
| return null; | ||
| } | ||
| case String label when READ_WRITE.label.equals(label) -> { |
There was a problem hiding this comment.
I haven't yet fully caught up on the newer features of switch/case. Does this have any advantage as compared to the simpler:
case "readWrite" -> {
return READ_WRITE;
}
case "readOnly" -> {
return READ_ONLY;
}There was a problem hiding this comment.
Or just an if :)
The reason for the new style was the "need" for the "String label when" qualifier, which I don't actually need because String.equals(xx) copes with being given non-strings fine. You can't use the syntax you suggested in the new switch since "value" isn't a String.
There was a problem hiding this comment.
Thank you for updating this to a traditional and trivial if/else.
| } | ||
|
|
||
| // Parses the file system permission from an environmental parameter. While | ||
| // the FileSystemAccessMode is private, we don't need to check if it was |
There was a problem hiding this comment.
The second sentence perhaps is a leftover? I don't see any FileSystemAccessMode type. I think it would be better to remove that second sentence altogether. The rest of the comment looks good and clearly states what this method does.
LanceAndersen
left a comment
There was a problem hiding this comment.
Thank. you David for your work here
A few comments on a quick pass in addition to those from Alan & Jai
The copyright will also need to be updated
| private static final String PROPERTY_MULTI_RELEASE = "multi-release"; | ||
|
|
||
| private static final Set<PosixFilePermission> DEFAULT_PERMISSIONS = | ||
| PosixFilePermissions.fromString("rwxrwxrwx"); |
There was a problem hiding this comment.
I am not convinced this needs to be changed as it becomes a personal style preference
There was a problem hiding this comment.
Possibly not, but it is only used in one place, and anyone reading the code now has the comment saying "If not specified in env, it will return 777." three lines away from the code that uses "rwxrwxrwx" instead of nearly 300.
Plus, there isn't now a mutable static final constant which could be accidentally leaked and mutated in the future.
Yes, it's personal preference to a degree, but I feel this is a small, but non-zero, improvement in maintainability, and I would ask someone to do this if I were reviewing a change which added this to the code.
| // create zip file using zipfs with default options | ||
| createTestZipFile(ZIP_FILE, ENV_DEFAULT).close(); | ||
| // check entries on zipfs with default options | ||
| try (FileSystem zip = FileSystems.newFileSystem(ZIP_FILE, ENV_DEFAULT)) { |
There was a problem hiding this comment.
This tests an empty Map and should still be a test as it is different from ENV_READ_ONLY
There was a problem hiding this comment.
Revert and added new ReadOnly variants of those tests. Thanks for calling me out on this, it was a bit sloppy of me.
| * </li> | ||
| * <li> | ||
| * Any other values will cause an {@code IllegalArgumentException} | ||
| * to be thrown. |
There was a problem hiding this comment.
The wording looks great. Just one thing with the "causes an IAE to be thrown" where I think it can be expanded to say that it causes IAE to be thrown when attempting to create the ZIP file system. The existing compressMethod property has word for this too.
There was a problem hiding this comment.
The IAE message would be best to standardize around the wording that the compressMethod property uses (same for releaseVersion while we are at it.
| private final ZipPath rootdir; | ||
| private boolean readOnly; // readonly file system, false by default | ||
| // Starts in readOnly (safe mode), but might be reset at the end of initialization. | ||
| private boolean readOnly = true; |
There was a problem hiding this comment.
If readOnly gets used by some code when a ZipFileSystem instance is being constructed (which is why I believe this can't be a final field), then I think we should not change this value to true. In other words, would this change now have a chance of introducing a ReadOnlyFileSystemException when constructing the ZipFileSystem whereas before it wouldn't?
There was a problem hiding this comment.
""would this change now have a chance of introducing a ReadOnlyFileSystemException when constructing the ZipFileSystem whereas before it wouldn't?""
If there was ever a "ReadOnlyFileSystemException" when initializing the ZipFileSystem, we have a very serious bug already. Since we already have the opening of Zip working for cases where the underlying file is read-only in the file system, it has to be true that no attempt is made to modify the file contents.
While I accept that this new code now calls out to functions with this flag in a different state to before, I believe this is an absolute improvement since:
-
It is not acceptable to say to users "we support a read-only mode" if during initialization we might modify the file in any way (even including changing last-modification times etc.).
-
All the evidence points to whichever operations are being done during init as being fundamentally "read-only" (both due to it working on read-only zip files, and there being no conceptual need to modify anything during setup).
I'll happily do a thorough audit of everything which could be affected by this change if that will give you confidence, but I would not want to change this code back to its default "read-write until stated otherwise" behaviour.
There was a problem hiding this comment.
Hello David, I had another look at this code and after going through it, it looked like readOnly field can in fact be made final because of the refactoring changes that you did in this PR. I checked out your latest PR locally and here's the additional change I did:
diff --git a/src/jdk.zipfs/share/classes/jdk/nio/zipfs/ZipFileSystem.java b/src/jdk.zipfs/share/classes/jdk/nio/zipfs/ZipFileSystem.java
index e2fddd96fe8..f54b5360ac5 100644
--- a/src/jdk.zipfs/share/classes/jdk/nio/zipfs/ZipFileSystem.java
+++ b/src/jdk.zipfs/share/classes/jdk/nio/zipfs/ZipFileSystem.java
@@ -110,8 +110,7 @@ class ZipFileSystem extends FileSystem {
private final Path zfpath;
final ZipCoder zc;
private final ZipPath rootdir;
- // Starts in readOnly (safe mode), but might be reset at the end of initialization.
- private boolean readOnly = true;
+ private final boolean readOnly;
// default time stamp for pseudo entries
private final long zfsDefaultTimeStamp = System.currentTimeMillis();
@@ -227,11 +226,6 @@ static ZipAccessMode from(Object value) {
// Determining a release version uses 'this' instance to read paths etc.
Optional<Integer> multiReleaseVersion = determineReleaseVersion(env);
-
- // Set the version-based lookup function for multi-release JARs.
- this.entryLookup =
- multiReleaseVersion.map(this::createVersionedLinks).orElse(Function.identity());
-
// We only allow read-write zip/jar files if they are not multi-release
// JARs and the underlying file is writable.
this.readOnly = forceReadOnly || multiReleaseVersion.isPresent() || !Files.isWritable(zfpath);
@@ -241,6 +235,10 @@ static ZipAccessMode from(Object value) {
: "the ZIP file is not writable";
throw new IOException(reason);
}
+ // Set the version-based lookup function for multi-release JARs.
+ this.entryLookup =
+ multiReleaseVersion.map(this::createVersionedLinks).orElse(Function.identity());
+
}
/**
With the refactoring changes you have done so far, we are now able to determine the ultimate value for readOnly before anything in the construction of ZipFileSystem would need access to it. Does this additional change look reasonable to you? I haven't run any tests against this change.
Making it final and having it not accessed until the ultimate value is set in the constructor would get us past any of the concerns we may have about the "initial" value of readOnly.
There was a problem hiding this comment.
On second thought, may be this isn't enough. I see that I missed the fact that determineReleaseVersion() requires access to this. Please leave this in the current form that you have in your PR. I need a few more moments to consider if anything needs to be changed.
There was a problem hiding this comment.
final members are good! Alternatively if making it final is not an option you could consider replacing:
private boolean readOnly;
with
private boolean readWrite;
I looked at where readOnly was used and there are not that many places, so inverting the flag might not be too intrusive.
| // Determining a release version uses 'this' instance to read paths etc. | ||
| // It requires 'entryLookup' and 'readOnly' to have safe defaults (which | ||
| // is why they are the only non-final fields), and it requires that the | ||
| // inode map has been initialized. |
There was a problem hiding this comment.
It's good to note that determineReleaseVersion(...) (and createVersionedLinks(...)) access instance fields of the ZipFileSystem being constructed. I think the comment however could be brief and should leave out the details about safe defaults.
Perhaps something like:
determineReleaseVersion() and createVersionedLinks() access instance fields while 'this' ZipFileSystem instance is being constructed.
There was a problem hiding this comment.
Not sure I see a need for the last sentence regarding the inode map having to be initialized in addition to Jai's comments above
There was a problem hiding this comment.
Fair enough, removed. I err on the side of over explaining things for future maintainers.
| // JARs and the underlying file is writable. | ||
| this.readOnly = forceReadOnly || multiReleaseVersion.isPresent() || !Files.isWritable(zfpath); | ||
| if (readOnly && accessMode == AccessMode.READ_WRITE) { | ||
| String reason = Files.isWritable(zfpath) |
There was a problem hiding this comment.
Nit - this additional call to Files.isWritable(...) can be avoided if we store the value of the previous call (a couple of lines above). I realize that the previous Files.isWritable is stashed at the end of the || conditionals to prevent it from being invoked in certain situations.
So maybe a better change would be something like:
String reason = multiReleaseVersion.isPresent()
? "A multi-release JAR file opened with a specified version is not writable"
: "The underlying ZIP file is not writable";which would then avoid any additional calls to Files.isWritable.
| ? "A multi-release JAR file opened with a specified version is not writable" | ||
| : "The underlying ZIP file is not writable"; | ||
| throw new IOException( | ||
| "A writable ZIP file system could not be opened for: " + zfpath + "\n" + reason); |
There was a problem hiding this comment.
The JDK coding guidelines recommend excluding file paths from exception messages. So in this case the zfpath should be left out from the exception message. Additionally, I haven't seen us using newline characters in exception messages that we construct in the JDK code. So I think we should leave that out too.
Given this, I think it might be simpler to just change this if block to something like:
if (...) {
String reason = multiReleaseVersion.isPresent()
? "the multi-release JAR file is not writable"
: "the ZIP file is not writable";
throw new IOException(reason);
}There was a problem hiding this comment.
Thanks for letting me know. Now I think about the "no filename" things is very sensible for core libraries.
Done.
| * {@linkplain Runtime.Version Java SE Platform version number}, | ||
| * an {@code IllegalArgumentException} will be thrown. | ||
| * an {@code IllegalArgumentException} will be thrown when the Zip | ||
| * filesystem is created. |
There was a problem hiding this comment.
Minor comment, I think you want "when creating the ZIP file system" here.
| * already exist. | ||
| * Specifying {@code create} as {@code true} and {@code accessMode} as | ||
| * {@code readOnly} will cause an {@code IllegalArgumentException} | ||
| * to be thrown when creating the ZIP file system. |
There was a problem hiding this comment.
I 'think' this should be "cause a" or "result in a" for the usages of "cause an" based on how the phrase is used
| * <em>read-only</em> access. | ||
| * <ul> | ||
| * <li> | ||
| * If no value is set, the file system is created <em>read-write</em> |
There was a problem hiding this comment.
created -> created as
I think it might be better as
If no value is specified, the file system is created as ....
| * isReadOnly()} will always return {@code true}. Creating a | ||
| * <em>read-only</em> file system requires the underlying ZIP file to | ||
| * already exist. | ||
| * Specifying {@code create} as {@code true} and {@code accessMode} as |
There was a problem hiding this comment.
Specifying {@code create}
You should make this clear that it is the 'create property '
jaikiran
left a comment
There was a problem hiding this comment.
This looks good to me. I would have prefered not changing the intial value of a the readOnly field to true, because that would mean one less thing to be concerned about. However, I am not too sure if it makes a practical difference. So what you have I think is OK. I'll think about it a bit more later and if anything shows up, I think we can address it.
I think this enhancement deserves a release note, so I have added that label to the JBS issue. Instructions for creating a release note are available here https://openjdk.org/guide/#release-notes.
Thank you David for being patient with the back and forth discussions and reviews. Please wait for Alan and Lance to have a chance to look at the final state of this PR before integrating.
|
Looks like I'm late to the party :-) My apologies if this has already been discussed a long time ago. My question is whether it's really necessary to restrict the 'read-only' flag to existing ZIP files only. I don't see any issue with allowing the creation of a new, empty ZIP file and marking it as read-only. Sure, it might be useless since it's just an empty zip file system, but that's up to the user. Not a very strong opinion though. 283 * 284 * If the value is {@code "readOnly"}, the file system is created 285 * read-only, and {@link java.nio.file.FileSystem#isReadOnly() 286 * isReadOnly()} will always return {@code true}. Creating a 287 * read-only file system requires the underlying ZIP file to 288 * already exist. 289 * Specifying the {@code create} property as {@code true} with the 290 * {@code accessMode} as {@code readOnly} will cause an {@code 291 * IllegalArgumentException} to be thrown when creating the ZIP file 292 * system. 293 * |
In the file system API, the "CREATE" option is specified to be ignored when opening a file for only reading. This means an I/O exception if the file doesn't exist. This was a pragmatic choice at the time that requires the full table of combinations and their outcome to see. For zipfs, the doing the same, or rejecting having the env contain both create=true and accessMode=readOnly, is okay. We chatted with David about this recently and agreed that rejecting this combination makes the most sense. It means someone has opt'ed it for both options, which is different to the file system API where it defaults to "READ" if none of the read/write/append options are provided. |
LanceAndersen
left a comment
There was a problem hiding this comment.
I think you are in good shape with your last set of changes David
AlanBateman
left a comment
There was a problem hiding this comment.
API docs + update to ZipFileSystem looks good. Test coverage looks good too.
|
/integrate |
|
@david-beaumont |
|
tier1, tier2 and tier3 testing of this change went fine. I'll go ahead and sponsor this now. |
|
/sponsor |
|
Going to push as commit 832c5b0.
Your commit was automatically rebased without conflicts. |
|
@jaikiran @david-beaumont Pushed as commit 832c5b0. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Adding read-only support to ZipFileSystem.
The new
accessModeenvironment property allows for readOnly and readWrite values, and ensures that the requested mode is consistent with what's returned.This involved a little refactoring to ensure that "read only" state was set initially and only unset at the end of initialization if appropriate.
By making 2 methods return values (rather than silently set non-final fields as a side effect) it's now clear in what order fields are initialized and which are final (sadly there are still non-final fields, but only a split of this class into two types can fix that, since determining multi-jar support requires reading the file system).
Progress
Issues
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25178/head:pull/25178$ git checkout pull/25178Update a local copy of the PR:
$ git checkout pull/25178$ git pull https://git.openjdk.org/jdk.git pull/25178/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25178View PR using the GUI difftool:
$ git pr show -t 25178Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25178.diff
Using Webrev
Link to Webrev Comment