Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

6957241: ClassLoader.getResources() returns only 1 instance when using jar indexing #5316

Closed
wants to merge 2 commits into from

Conversation

shiyuexw
Copy link
Contributor

@shiyuexw shiyuexw commented Aug 31, 2021

Using jarIndex for Hibench, there is an unexpected behavior with the exception "Exception in thread "main" org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "hdfs"".

After investigating it, it is related to the usage of ServiceLoader with JarIndex.
The below stack shows the issue with JDK11:

getResource:1016, URLClassPath$JarLoader (jdk.internal.loader)
getResource:937, URLClassPath$JarLoader (jdk.internal.loader)
findResource:912, URLClassPath$JarLoader (jdk.internal.loader)
next:341, URLClassPath$1 (jdk.internal.loader)
hasMoreElements:351, URLClassPath$1 (jdk.internal.loader)
hasNext:355, BuiltinClassLoader$1 (jdk.internal.loader)
hasMoreElements:363, BuiltinClassLoader$1 (jdk.internal.loader)
next:3032, CompoundEnumeration (java.lang)
hasMoreElements:3041, CompoundEnumeration (java.lang)
nextProviderClass:1203, ServiceLoader$LazyClassPathLookupIterator (java.util)
hasNextService:1221, ServiceLoader$LazyClassPathLookupIterator (java.util)
hasNext:1265, ServiceLoader$LazyClassPathLookupIterator (java.util)
hasNext:1300, ServiceLoader$2 (java.util)
hasNext:1385, ServiceLoader$3 (java.util)

The below API tries to get all the resources with the same name.

public Enumeration<URL> findResources(final String name,
                                     final boolean check) 

After using JarIndex, URLClassPath.findResources only returns 1 URL.
It is the same as the description in JDK-6957241.

The issue still exists in JDK18.

Root cause:

public Enumeration<URL> findResources(final String name,
                                     final boolean check) {
        return new Enumeration<>() {
            private int index = 0;
            private URL url = null;

            private boolean next() {
                if (url != null) {
                    return true;
                } else {
                    Loader loader;
                    while ((loader = getLoader(index++)) != null) {
                        url = loader.findResource(name, check);
                        if (url != null) {
                            return true;
                        }
                    }
                    return false;
                }
            }
...
        };
    }

With the JarIndex, there is only one loader which is corresponding to the jar with the index due to the implementation in JarLoader.getResource(final String name, boolean check, Set visited).

Loaders corresponding to other jar packages will not appear in this while.
So it only returns 1 instance.

To solve the issue, I change the implementation "private boolean next()".
If the loader has index, traverse the index and get all the resource from the loader.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-6957241: ClassLoader.getResources() returns only 1 instance when using jar indexing

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5316/head:pull/5316
$ git checkout pull/5316

Update a local copy of the PR:
$ git checkout pull/5316
$ git pull https://git.openjdk.java.net/jdk pull/5316/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 5316

View PR using the GUI difftool:
$ git pr show -t 5316

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5316.diff

@bridgekeeper bridgekeeper bot added the oca label Aug 31, 2021
@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Aug 31, 2021

Hi @shiyuexw, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user shiyuexw" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

@shiyuexw
Copy link
Contributor Author

@shiyuexw shiyuexw commented Aug 31, 2021

/covered

@openjdk
Copy link

@openjdk openjdk bot commented Aug 31, 2021

⚠️ @shiyuexw a branch with the same name as the source branch for this pull request (master) is present in the target repository. If you eventually integrate this pull request then the branch master in your personal fork will diverge once you sync your personal fork with the upstream repository.

To avoid this situation, create a new branch for your changes and reset the master branch. You can do this by running the following commands in a local repository for your personal fork. Note: you do not have to name the new branch NEW-BRANCH-NAME.

$ git checkout -b NEW-BRANCH-NAME
$ git branch -f master 16e83058cab4dd4d4a3fba812c8fe50e4286bd22
$ git push -f origin master

Then proceed to create a new pull request with NEW-BRANCH-NAME as the source branch and close this one.

@bridgekeeper bridgekeeper bot added the oca-verify label Aug 31, 2021
@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Aug 31, 2021

Thank you! Please allow for a few business days to verify that your employer has signed the OCA. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!

@openjdk
Copy link

@openjdk openjdk bot commented Aug 31, 2021

@shiyuexw The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs label Aug 31, 2021
@bridgekeeper bridgekeeper bot removed oca oca-verify labels Aug 31, 2021
@openjdk openjdk bot added the rfr label Aug 31, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Aug 31, 2021

Webrevs

@AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented Sep 1, 2021

Thanks for the PR/patch. I think we need a discussion on core-libs-dev as to whether to keep the legacy JAR index support. The original motivation was applets when it was added in JDK 1.3. I don't think findResources has ever worked with the JAR index for cases where the same resource is in more than one JAR files. I'm not opposed to fixing it but it does add complexity and will likely refactoring the UCP implementation for something that will likely go away eventually anyway.

@dfuch
Copy link
Member

@dfuch dfuch commented Sep 1, 2021

If there is a way to simplify the UCP code by removing the support for legacy JAR index, and if it doesn't cause regressions, I'm all for it. But given the intricacy of that code I suspect it will be quite an undertaking - and reviewing such a change will probably not be trivial. The current PR is not trivial to review and seems to be adding an other layer of complexity though, so I agree that we should have this discussion.

@AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented Sep 2, 2021

@wxiang I think there is at least some support for removing the JAR indexing support rather than trying to fix findResources. The issue of what to do with the legacy JAR index mechanism came up during JDK 9 in the context of modular JARs and also Multi-Release JARs but it was too much to take on at the time. Would you be interested in working out the changes to remove it?

@shiyuexw
Copy link
Contributor Author

@shiyuexw shiyuexw commented Sep 3, 2021

@AlanBateman Sure, I am interested in it.

@AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented Sep 3, 2021

@AlanBateman Sure, I am interested in it.

Great! I think there are several parts to this. The removal of the JAR index support from the URLClassLoader implementation, the jar i option, the JAR file spec, and the jar tool man page.

It would be good to create a patch for the removal to see if there are any issues. There will probably need to be some discussion on what to do with the jar tool. I suspect we will need to keep the code that updates the index when updating a JAR file that has an existing index, this means keeping JarIndex and maybe moving it to the jdk.jartool module. We can change jar i to print a warning when the tool is called to generate an index. Don't worry about the JAR spec and man page for now.

@stuart-marks
Copy link
Member

@stuart-marks stuart-marks commented Sep 3, 2021

There will probably need to be some discussion on what to do with the jar tool. I suspect we will need to keep the code that updates the index when updating a JAR file that has an existing index, this means keeping JarIndex....

Depends on how draconian we're willing to be. At some point I can see the tool removing the index if it's updating a jar file that has one. Suitable warning messages would need to be emitted.

@shiyuexw
Copy link
Contributor Author

@shiyuexw shiyuexw commented Sep 4, 2021

@AlanBateman Sure, I am interested in it.

Great! I think there are several parts to this. The removal of the JAR index support from the URLClassLoader implementation, the jar i option, the JAR file spec, and the jar tool man page.

It would be good to create a patch for the removal to see if there are any issues. There will probably need to be some discussion on what to do with the jar tool. I suspect we will need to keep the code that updates the index when updating a JAR file that has an existing index, this means keeping JarIndex and maybe moving it to the jdk.jartool module. We can change jar i to print a warning when the tool is called to generate an index. Don't worry about the JAR spec and man page for now.

I will first create the patch to remove JAR index support from the URLClassLoader implementation, the jar i option.

@AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented Sep 4, 2021

I will first create the patch to remove JAR index support from the URLClassLoader implementation, the jar i option.

Thank you. We'll probably need a new JBS issue and PR for that but let's see first if any issues come out of the wood work. Stuart is right that another option for the jar tool is to drop the index when updating an existing JAR file that has an index, we don't need to decide that just yet.

@mlbridge
Copy link

@mlbridge mlbridge bot commented Sep 4, 2021

Mailing list message from Lance Andersen on core-libs-dev:

On Sep 4, 2021, at 2:53 AM, Alan Bateman <alanb at openjdk.java.net<mailto:alanb at openjdk.java.net>> wrote:

On Sat, 4 Sep 2021 01:53:35 GMT, wxiang <github.com+53162078+shiyuexw at openjdk.org<mailto:github.com+53162078+shiyuexw at openjdk.org>> wrote:

I will first create the patch to remove JAR index support from the URLClassLoader implementation, the `jar i` option.

Thank you. We'll probably need a new JBS issue and PR for that but let's see first if any issues come out of the wood work. Stuart is right that another option for the jar tool is to drop the index when updating an existing JAR file that has an index, we don't need to decide that just yet.

Perhaps the jar validate option could/should be updated to flag when an index is there?

We could add a warning for JDK 18 when main::genIndex is invoked as I assume we would want to at least wait until JDK19 to remove this functionality?

Best
Lance

-------------

PR: https://git.openjdk.java.net/jdk/pull/5316

[cid:E1C4E2F0-ECD0-4C9D-ADB4-B16CA7BCB7FC at home]

Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037
Oracle Java Engineering
1 Network Drive
Burlington, MA 01803
Lance.Andersen at oracle.com<mailto:Lance.Andersen at oracle.com>

@mlbridge
Copy link

@mlbridge mlbridge bot commented Sep 5, 2021

Mailing list message from Alan Bateman on core-libs-dev:

On 04/09/2021 12:12, Lance Andersen wrote:

Perhaps the jar validate option could/should be updated to flag when
an index is there?

We could add a warning for JDK 18 when main::genIndex is invoked as I
assume we would want to at least wait until JDK19 to remove this
functionality?

I'm not sure about jar --validate complaining if META-INF/INDEX.LIST is
present. It may be useful if it were to check that the index is correct
but it would need the entries listed in the Class-Path attribute to be
found.

I don't think we really need to do anything to the jar tool in the short
term, except just to move JarIndex to the jdk.jartool module. I think
the more important part is removing the JAR index support from the
runtime. I don't expect any issues with the URLClassLoader
implementation but it is possible that something comes out of the wood
work with signed JARs (JarVerifier).

-Alan

@shiyuexw
Copy link
Contributor Author

@shiyuexw shiyuexw commented Sep 7, 2021

I created a new JBS https://bugs.openjdk.java.net/browse/JDK-8273401, and PR: #5383.
The PR is to remove the JarIndex support in URLClassPath and move JarIndex into the jdk.jartool module.

@shiyuexw
Copy link
Contributor Author

@shiyuexw shiyuexw commented Sep 27, 2021

JarIndex support will be deleted. close the PR.

@shiyuexw shiyuexw closed this Sep 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs rfr
4 participants