New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JDK-6956385: URLConnection.getLastModified() leaks file handles for jar:file and file: URLs #12871
Conversation
Hi @jglick, welcome to this OpenJDK project and thanks for contributing! We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user jglick" as summary for the issue. If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing |
src/java.base/share/classes/sun/net/www/protocol/jar/JarURLConnection.java
Outdated
Show resolved
Hide resolved
Co-authored-by: ExE Boss <3889017+ExE-Boss@users.noreply.github.com>
JarURLConnection
may fail to close its underlying FileURLConnection
@jglick This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
Thank you! Please allow for a few business days to verify that your employer has signed the OCA. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated! |
This comment was marked as outdated.
This comment was marked as outdated.
Webrevs
|
public boolean isConnected() { | ||
return connected; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would be better to add a
void closeInputStream() { ... } // (or void close() { ... }?)
that would simply close the input stream if it's not null...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
close()
would be natural enough, and would encapsulate behavior a bit better.
Should it then actually implement Closeable
, permitting the implicit request in #12871 (comment) to be addressed? That would not technically be an API change, since the implementation class is not exported, though it could be a behavioral change in case anyone is checking instanceof Closeable
(or instanceof AutoCloseable
); and would be a first step along the way to JDK-8224095, which may or may not be appropriate for something labeled a bug fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good question. I we made FileURLConnection
Closeable
or AutoCloseable
, and called close()
in JarURLConnection
when the underlying connection is (Auto)Closeable
then I believe it could warrant a CSR. On the other hand IIRC the file:
protocol handler cannot be overridden - so I'm not sure whether we could actually have a custom underlying URLConnection
- that would need to be carefully evaluated.
Also, as you note, instance of FileURLConnection may be returned to user code - so the fact that the URLConnection implements Closeable
/Autocloseable
could be observed, and would also require a CSR.
That said - we could simply expose a close() method on FileURLConnection
without implementing Closeable
/Autocloseable
- and could examine that (implementing Closeable
/Autocloseable
) in a different PR - perhaps as a Enhencement request - rather than a bug fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the other hand IIRC the
file:
protocol handler cannot be overridden - so I'm not sure whether we could actually have a custom underlyingURLConnection
- that would need to be carefully evaluated.
Strike that. We could have of course a jar:http://...
URL so the underlying connection is not necessarily a file connection and therefore can come from a custom protocol handler. So a CSR would definitely be needed if we started calling close() on custom URLConnections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
Also I noticed that we cannot define void close() throws IOException
here either, because of
jdk/src/java.base/share/classes/sun/net/www/URLConnection.java
Lines 256 to 262 in 44218b1
/** | |
* Call this to close the connection and flush any remaining data. | |
* Overriders must remember to call super.close() | |
*/ | |
public void close() { | |
url = null; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curiously, this method appears to never be called; at least
diff --git src/java.base/share/classes/sun/net/www/URLConnection.java src/java.base/share/classes/sun/net/www/URLConnection.java
index b3af24c594b..f346cd8869b 100644
--- src/java.base/share/classes/sun/net/www/URLConnection.java
+++ src/java.base/share/classes/sun/net/www/URLConnection.java
@@ -253,14 +253,6 @@ public abstract class URLConnection extends java.net.URLConnection {
REMIND */ ;
}
- /**
- * Call this to close the connection and flush any remaining data.
- * Overriders must remember to call super.close()
- */
- public void close() {
- url = null;
- }
-
private static HashMap<String,Void> proxiedHosts = new HashMap<>();
public static synchronized void setProxiedHost(String host) {
does not cause
make images
to fail, though it seems that the JDK source base does not currently enforce use of @Override
. (I did not find any overrides from a text search, and if there were some without the annotation, they were apparently not calling super.close()
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. If it's not called anywhere maybe we could investigate adding throws IOException
to the signature...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is also an option of course; would be binary compatible. Happy to take that approach if you think it would be clearer.
@@ -25,6 +25,7 @@ | |||
|
|||
package sun.net.www.protocol.jar; | |||
|
|||
import sun.net.www.protocol.file.FileURLConnection; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How tightly connected are the two protocol handlers today?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously not coupled explicitly, only by virtue of jar:file:…
being parsed out using the default URL protocol handler.
@@ -90,6 +90,14 @@ public void connect() throws IOException { | |||
} | |||
} | |||
|
|||
public synchronized void closeInputStream() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that synchronization matches
jdk/src/java.base/share/classes/sun/net/www/protocol/file/FileURLConnection.java
Lines 182 to 214 in 37b042c
public synchronized InputStream getInputStream() | |
throws IOException { | |
int iconHeight; | |
int iconWidth; | |
connect(); | |
if (is == null) { | |
if (isDirectory) { | |
FileNameMap map = java.net.URLConnection.getFileNameMap(); | |
StringBuilder sb = new StringBuilder(); | |
if (files == null) { | |
throw new FileNotFoundException(filename); | |
} | |
files.sort(Collator.getInstance()); | |
for (int i = 0 ; i < files.size() ; i++) { | |
String fileName = files.get(i); | |
sb.append(fileName); | |
sb.append("\n"); | |
} | |
// Put it into a (default) locale-specific byte-stream. | |
is = new ByteArrayInputStream(sb.toString().getBytes()); | |
} else { | |
throw new FileNotFoundException(filename); | |
} | |
} | |
return is; | |
} |
jdk/src/java.base/share/classes/sun/net/www/protocol/file/FileURLConnection.java
Lines 73 to 91 in 37b042c
public void connect() throws IOException { | |
if (!connected) { | |
try { | |
filename = file.toString(); | |
isDirectory = file.isDirectory(); | |
if (isDirectory) { | |
String[] fileList = file.list(); | |
if (fileList == null) | |
throw new FileNotFoundException(filename + " exists, but is not accessible"); | |
files = Arrays.<String>asList(fileList); | |
} else { | |
is = new BufferedInputStream(new FileInputStream(filename)); | |
} | |
} catch (IOException e) { | |
throw e; | |
} | |
connected = true; | |
} | |
} |
jdk/src/java.base/share/classes/sun/net/www/protocol/file/FileURLConnection.java
Lines 185 to 186 in 37b042c
int iconHeight; | |
int iconWidth; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - that seems reasonable. I agree the use of synchronization here is a bit dubious. URL / URLConnection and connection Handlers are quite old and unfortunately suffer from a bad case of technical debt - which is hard to remove.
src/java.base/share/classes/sun/net/www/protocol/file/FileURLConnection.java
Outdated
Show resolved
Hide resolved
…en after `closeInputStream` openjdk#12871 (comment)
The latest revision look reasonable to me. At a minimum please run all URL/URL Stream Handler tests. If you have no more changes planned I'll import the PR and give it a round of additional testing. |
Ran make test TEST=:jdk_net on Linux. Only failures are
Since git checkout master
make clean images test TEST=java/net/MulticastSocket/Promiscuous.java fails similarly, I presume these are unrelated, perhaps something about my network configuration. I have no more changes planned. |
I have run tier1, tier2, tier3, and several repeated runs of jdk_net in our CI and haven't observed any failures related to the proposed changes. As far as I am concerned, this looks good. I am going to approve, but it would good to have a second Reviewer in case I missed anything. |
/reviewers 2 |
@jglick This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 552 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@dfuch, @Michael-Mc-Mahon) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
jarFile.close(); | ||
} | ||
} finally { | ||
if (jarFileURLConnection instanceof FileURLConnection fileURLConnection) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I Wonder if closing all urlconnection types or the ones with a special interface instead of a specific type (which also introduces a dependency) would be the better way? what about other connections with even more temporary resources (like a temporary file copy?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See discussion at #12871 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not address the question if it needs to close more than the file: protocol.
(And it still feels like a layering violation, should maybe the metadata function initializeHeaders close its own stream - balanced with connect())?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we wanted to make JarURLConnection call AutoCloseable::close (or Closeable::close) when the underlying URLConnection is AutoCloseable
- that would be a change of behaviour that would be observable by custom URLConnection implementations provided by custom URL Stream Handler.
That would be a change of behavior requiring a CSR, and it might not be appropriate for backport, due to potential backward compatibility issue (well - that would be a matter to discuss in the CSR anyway).
That's why I would advise to not try to do this in this PR. If we wanted to do it - we should probably log an RFE for that. Also examine what it means for URL.openConnection()
to return an object that implements AutoCloseable
. A the very least, an @apiNote
would be needed in URL API documentation.
Unless this really solves some real hard issue (as opposed to theoretical issues) for which we have no solution today, I would be very prudent with making wider changes to this old, intricate, and hard to maintain piece of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should maybe the metadata function initializeHeaders close its own stream
Another possibility is for FileURLConnection.getHeaderField(String)
and related methods to never open a FileInputStream
to begin with, amending
jdk/src/java.base/share/classes/sun/net/www/protocol/file/FileURLConnection.java
Lines 96 to 100 in 8474e69
try { | |
connect(); | |
exists = file.exists(); | |
} catch (IOException e) { | |
} |
jdk/src/java.base/share/classes/sun/net/www/URLConnection.java
Lines 104 to 131 in 8474e69
public String getHeaderField(String name) { | |
try { | |
getInputStream(); | |
} catch (Exception e) { | |
return null; | |
} | |
return properties == null ? null : properties.findValue(name); | |
} | |
Map<String, List<String>> headerFields; | |
@Override | |
public Map<String, List<String>> getHeaderFields() { | |
if (headerFields == null) { | |
try { | |
getInputStream(); | |
if (properties == null) { | |
headerFields = super.getHeaderFields(); | |
} else { | |
headerFields = properties.getHeaders(); | |
} | |
} catch (IOException e) { | |
return super.getHeaderFields(); | |
} | |
} | |
return headerFields; | |
} |
jdk/src/java.base/share/classes/sun/net/www/protocol/file/FileURLConnection.java
Lines 137 to 140 in 8474e69
public String getHeaderField(String name) { | |
initializeHeaders(); | |
return super.getHeaderField(name); | |
} |
super
. The legality of such a change in light of jdk/src/java.base/share/classes/java/net/URLConnection.java
Lines 57 to 66 in 8474e69
* In general, creating a connection to a URL is a multistep process: | |
* <ol> | |
* <li>The connection object is created by invoking the | |
* {@link URL#openConnection() openConnection} method on a URL. | |
* <li>The setup parameters and general request properties are manipulated. | |
* <li>The actual connection to the remote object is made, using the | |
* {@link #connect() connect} method. | |
* <li>The remote object becomes available. The header fields and the contents | |
* of the remote object can be accessed. | |
* </ol> |
jdk/src/java.base/share/classes/java/net/URLConnection.java
Lines 92 to 96 in 8474e69
* The following methods are used to access the header fields and | |
* the contents after the connection is made to the remote object: | |
* <ul> | |
* <li>{@code getContent} | |
* <li>{@code getHeaderField} |
connect()
is called. On the other hand jdk/src/java.base/share/classes/sun/net/www/protocol/jar/JarURLConnection.java
Lines 232 to 235 in 8474e69
public String getHeaderField(String name) { | |
return jarFileURLConnection.getHeaderField(name); | |
} | |
connect()
s if it needed to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has this comment
Interesting. And yet that does not appear to be true in master
: getContentType
and getContentLength
will return whatever was in effect at the time initializeHeaders()
was first called, which may have been before any explicit call to connect()
, or hours after an explicit call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! It's true only if the first time connect is called, it's implicitly by initializeHeaders()
(sigh)...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Michael-Mc-Mahon what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello Jesse, Daniel, I had a look at the existing code in sun.net.www.protocol.file.FileURLConnection
its super class sun.net.www.URLConnection
and then its superclass java.net.URLConnection
.
The various header related methods in sun.net.www.URLConnection
first call the getInputStream()
(and ignore that stream completely) and then check the properties
field to get the header value. The call to getInputStream()
is an internal implementation detail (since the java.net.URLConnection
public APIs make no mention that it's required or will be called) and it seems to me that it's being called to make sure that the underlying resource is "connectable/available". The returned InputStream
itself never gets used.
For the FileURLConnection
class, the InputStream
plays no role to populate these header fields. Yet, before the changes in this PR, it does end up creating an InputStream
when trying to populate these header fields. I think the cleanest fix would be to not create this InputStream
when populating these header fields (i.e. from initializeHeaders()
method). That would mean that the initializeHeaders()
wouldn't need to call the connect()
method, like what's being proposed in this discussion thread. I think not calling connect()
from initializeHeaders()
(which gets called from various getHeader... methods) will still satisfy the specification because java.net.URLConnection#connect()
method states:
...
Operations that depend on being connected, like getContentLength, will implicitly perform the connection, if necessary.
The "if necessary" in that sentence I think allows us to not "connect()" when populating and returning header fields.
I think the only place we should call connect() internally in this FileURLConnection
is when getInputStream()
method gets called on this class.
P.S: The current implementation in initializeHeaders()
has some oddities. For example, if the File
was a directory then in connect()
we set isDirectory()
as true
. Then at some later point, when someone calls a getHeader... method and initializeHeaders()
gets called, if the directory was deleted from the filesystem, then initializeHeaders()
exists
will be false, so it will go into the if (!initializedHeaders || !exists) {
block but will still end up using the old outdated isDirectory
flag to populate the properties
. That's a different unrelated issue though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we should wait for @Michael-Mc-Mahon input before reworking these methods to not call connect()
. There seems to be an assumption that connect()
would/should be called.
What I like with the current fix is that it's a very narrow fix which is unlikely to cause too much backward compatibility issues.
…ntentEncoding` but better matches the bug title (suggested by @ecki openjdk#12871 (comment))
…onnectionLeak-JDK-6956385
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks fine to me. I had a slight concern about closing the underlying stream and then maybe trying to obtain header information from the JAR url connection itself. But, it looks like FileURLConnection captures whatever information is available at connect() time.
Hi @jglick sorry for the long time this took to review. I'm happy with the proposed change as they are. If you |
/summary |
/integrate |
(Trying to follow instructions in https://openjdk.org/jeps/369, if those are even still valid. https://github.com/openjdk/jdk/blob/master/CONTRIBUTING.md is not very helpful. I guess it should link to https://openjdk.org/guide/#life-of-a-pr which makes no mention of |
@jglick Setting summary to:
|
/sponsor |
Going to push as commit 9f98136.
Your commit was automatically rebased without conflicts. |
@Michael-Mc-Mahon @jglick Pushed as commit 9f98136. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Hello Jesse,
Skara bot commands are listed in https://wiki.openjdk.org/display/SKARA/Pull+Request+Commands |
JDK-6956385:
JarURLConnection
properly tracks anyInputStream
it itself opened, and correspondingly closes theJarFile
if necessary (when caches are disabled). However if its underlyingFileURLConnection
was used to retrieve a header field, that would have caused aFileInputStream
to be opened which never gets closed until it is garbage collected. This means that an application which calls certain methods onjar:file:/…something.jar!/…
URLs will leak file handles, even ifURLConnection
caches are supposed to be turned off. This can delay release of system resources, and on Windows can prevent the JAR file from being deleted even after it is no longer in use (for example afterURLClassLoader.close
).JDK-8224095 was marked as a duplicate, but I think incorrectly. It refers to
FileURLConnection
, and seems to be complaining about the confusing API design ofURLConnection
generally: that it is an object which you might expect to beCloseable
but in fact it is itsinputStream
which must beclose
d. In JDK-6956385, even when the caller does specifically callInputStream.close
, a file handle may be leaked.I managed to build the JDK on both Linux and (Cygwin) Windows to confirm the fix via
I also ran jtreg on Linux (this test and various others in nearby packages); on Windows the
make test
target just hung for some reason.I marked the test
othervm
out of caution, since it is mutating global state, and if it fails will leak a handle and prevent scratch dir cleanup on Windows.(This is my first contribution, at least after the move to GitHub, so let me know if something is missing here technically or stylistically. None of the contribution guides appear to be up to date.)
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/12871/head:pull/12871
$ git checkout pull/12871
Update a local copy of the PR:
$ git checkout pull/12871
$ git pull https://git.openjdk.org/jdk.git pull/12871/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 12871
View PR using the GUI difftool:
$ git pr show -t 12871
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12871.diff
Webrev
Link to Webrev Comment