Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LUCENE-10335: Deprecate broken helper methods not compatible with module system; add ModuleResourceLoader and add integration test #567

Merged
merged 21 commits into from
Jan 3, 2022

Conversation

uschindler
Copy link
Contributor

@uschindler
Copy link
Contributor Author

I have not yet checked Luke's image loading, as I touched this while reviewing all calls to getResource()/getResourceAsStream().

@uschindler
Copy link
Contributor Author

I fixed Luke. I will now check for other places that do getClassLoader().getResource...

@uschindler
Copy link
Contributor Author

We have some more in Benchmark module and tests of Spatial module. We may need to fix those and move resources to exported packages (not a root folder called "data/").

Copy link
Contributor

@dweiss dweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@uschindler
Copy link
Contributor Author

I found another problematic one: loadStopwordSet in StopAnalyzerBase. This does not work from inside nor/kuromoji and others. I will deprecated in same way.

I would like to make this method only visible from the analyzers-common module, but I am not sure how.

@rmuir
Copy link
Member

rmuir commented Dec 23, 2021

How can we have failing tests for these problems? We need a test-driven approach to fixing this stuff. gradlew check is totally happy for me.

@uschindler
Copy link
Contributor Author

uschindler commented Dec 23, 2021

How can we have failing tests for these problems? We need a test-driven approach to fixing this stuff. gradlew check is totally happy for me.

This is preparatory. We have a branch where it does not work because of this.

See the issue for explanation. The problem is that getResourceAsStream() and getResource() are caller-sensitive and check who calls it. If its a classpath app, it works as the checks are ignored. With module system the method returns null if you try to load a resource in another module. I will add a test to the distribution tests to show this.

The basic problem is:
analyzer module calls into core module and core module tries to load resource from analyzers module. So the indirection analyzer -> core -> analyzer must go away

@uschindler
Copy link
Contributor Author

To explain: We need to go away with a public static helper class that calls Class#getResourceAsStream() on behalf of somebody else.

@rmuir
Copy link
Member

rmuir commented Dec 23, 2021

I understand why the method doesnt work for modules: I just don't care.

I am only asking about failing tests.

@uschindler
Copy link
Contributor Author

@rmuir we will have a test-driven approach soon after xmas when Dawid finishes the conversion to module system. But all work that can be done before should be done before, as it will be a messy huge patch.

@uschindler
Copy link
Contributor Author

I understand why the method doesnt work for modules: I just don't care.

I am only asking about failing tests.

For now I can only make sure I don't break the status as is.

@rmuir
Copy link
Member

rmuir commented Dec 23, 2021

Also, are we sure this stuff isn't bugs in the JDK? Seems like a design flaw in modules.

I have the feeling this is going to break the analysis module left and right. I am concerned about things like analyzer factories, stopwordananalyzerbase, the list goes on and on. I cannot "see" the scope of the problem without failing tests LOL.

@rmuir
Copy link
Member

rmuir commented Dec 23, 2021

The problem is that getResourceAsStream() and getResource() are caller-sensitive and check who calls it.

That's openjdk internal bullshit and not in the public java docs. we need to push back on openjdk here, this is a bug!

@rmuir
Copy link
Member

rmuir commented Dec 23, 2021

Sorry, I have to say I am -1 at the moment to "gutting" all the analysis code, without tests, before first even trying to report these bugs to openjdk (it is THEIR FAULT). We should get it fixed upstream.

@uschindler
Copy link
Contributor Author

uschindler commented Dec 23, 2021

Also, are we sure this stuff isn't bugs in the JDK? Seems like a design flaw in modules.

I was in the discussion before Java 9 came out. It was a mess and I argued several times, because you were not even able to load class files anymore as resources (now they have an exception for that specific use case in ClassLoader). Class#getResourceAsStream() (and getResource, too) was changed several times to work around some problems with legacy apps.

At the end we came out with this and it is clearly documented in the JLS and the API and you won't change it anymore:

  • a Class instance located in unnamed modules (classpath) can be used to load any resource (it's the backwards compatibility layer)
  • a Class instance for a class located in a named module can only load resoures that the caller of getResourceAsStream can see (exported packages). This won't be problematic for analyzers module unless we use static utility methods.

In the analyzers module it is not a problem anymore, only a few issues are there.

@rmuir
Copy link
Member

rmuir commented Dec 23, 2021

Where is it clearly documented? I don't see any such documentation: BUG.

@uschindler
Copy link
Contributor Author

Robert, please let me finish this, otherwise I will trash down everything and stop working anymore on module system. If you want modules (and looks like you want) let us do our work and be calm. Thanks.

I will work more and commit this a bit later. These changes are not risky at all and IMHO I fixed a lot of other stuff around.

@rmuir
Copy link
Member

rmuir commented Dec 23, 2021

My problem is that the justification for fixing this stuff is that, "this code is buggy, it calls a CallerSensitive method".

CallerSensitive isn't even publicly documented. This is OpenJDK internal stuff. OpenJDK BUGS!

Uwe, I don't meant to discourage you from working on modules. It isn't a personal attack. I am angry about the OpenJDK behavior here and honestly believe this stuff is a bug.

@uschindler
Copy link
Contributor Author

uschindler commented Dec 23, 2021

Hi, see the bold text and especially the italic one: (https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Class.html#getResourceAsStream(java.lang.String)):

Finds a resource with a given name.

If this class is in a named Module then this method will attempt to find the resource in the module. This is done by delegating to the module's class loader findResource(String,String) method, invoking it with the module name and the absolute name of the resource. Resources in named modules are subject to the rules for encapsulation specified in the Module getResourceAsStream method and so this method returns null when the resource is a non-".class" resource in a package that is not open to the caller's module.

Otherwise, if this class is not in a named module then the rules for searching resources associated with a given class are implemented by the defining class loader of the class. This method delegates to this object's class loader. If this object was loaded by the bootstrap class loader, the method delegates to ClassLoader.getSystemResourceAsStream(java.lang.String).

@rmuir
Copy link
Member

rmuir commented Dec 23, 2021

Thanks Uwe. I suppose "caller" there is enough to justify their shitty CallerSensitive shenanigans. This really sucks though, they screwed up.

Please proceed, sorry for noise.

@uschindler
Copy link
Contributor Author

Thanks Uwe. I suppose "caller" there is enough to justify their shitty CallerSensitive shenanigans. This really sucks though, they screwed up.

I was angry about that 4 or 5 years ago already. By the way the exception for "class" files was my work at that time.

@uschindler
Copy link
Contributor Author

I am thinking about a small solution to still use StopAnalyzerBase#loadStopwordSet(...,Class,...) and not duplicate code everywhere like in current patch.

@rmuir
Copy link
Member

rmuir commented Dec 23, 2021

I am thinking about a small solution to still use StopAnalyzerBase#loadStopwordSet(...,Class,...) and not duplicate code everywhere like in current patch.

But how will we fix analyzer factories (ClasspathResourceLoader) ? This seems like the biggest issue.

@uschindler
Copy link
Contributor Author

Hi @mocobeta,
I cherrypicked your commit and modified it a bit: To still support full names, it just prefixes all paths with "/". After doing this I was able to revert the test changes which shows that its 100% backwards compatible.

In general a simple hotfix rule for Class#getClassLoader().getResource(xxx) is: Replace by Class#getResource("/".concat(xxx))

The main difference between ClassLoader and Class ist: For Class you can load non-open module-local resources, while ClassLoader ALWAYS needs open packages, also for callers from own module. The reason for that is simple: A ClassLoader can be shared between different modules, so the getResource() methods there have no idea from which module you want to get resources, so they block everything which is not open.

So we should really forbid usage of ClassLoader#getResource...() (separate issue).

@uschindler uschindler marked this pull request as ready for review December 31, 2021 12:05
@uschindler
Copy link
Contributor Author

uschindler commented Dec 31, 2021

In general a simple hotfix rule for Class#getClassLoader().getResource(xxx) is: Replace by Class#getResource("/".concat(xxx))

An alternative is Class#getModule().getResourceAsStream(xxx), which has similar behaviour like old code (but is limited to one module).

But not for the Kuromoji/Nori code, as this would limit to only current module so breaking backwards.

@mocobeta
Copy link
Contributor

@uschindler Thank you. I looked through the changes.
I can't look closely at or review the whole change soon since I have to leave for a while (my father just passed away due to old age and lots of work is waiting for me), so please go ahead. I hope I'll be able to catch up on the progress of the jms support later.

@uschindler
Copy link
Contributor Author

Hi,

@uschindler Thank you. I looked through the changes. I can't look closely at or review the whole change soon since I have to leave for a while (my father just passed away due to old age and lots of work is waiting for me), so please go ahead. I hope I'll be able to catch up on the progress of the jms support later.

That's really bad news! I send you my deepest condolences! Take your time!

Most of this PR is mechanical work, so the only things to "review" are the API changes. I am sure, Dawid will also have a look.

Happy New Year,
Uwe

@mocobeta
Copy link
Contributor

Thank you, Uwe. I am grateful for your kind concern.
Wish you all a Happy New Year!

@rmuir
Copy link
Member

rmuir commented Dec 31, 2021

@uschindler Thank you. I looked through the changes.
I can't look closely at or review the whole change soon since I have to leave for a while (my father just passed away due to old age and lots of work is waiting for me), so please go ahead. I hope I'll be able to catch up on the progress of the jms support later.

I'm sorry to hear this @mocobeta

@mocobeta
Copy link
Contributor

mocobeta commented Jan 1, 2022

@rmuir Thank you, for your concern.

@dweiss
Copy link
Contributor

dweiss commented Jan 1, 2022

I am very sorry for your loss, @mocobeta, please accept my condolences.

@mocobeta
Copy link
Contributor

mocobeta commented Jan 1, 2022

@dweiss Thank you, I appreciate your concern.

@uschindler uschindler merged commit cc342ea into apache:main Jan 3, 2022
@uschindler uschindler deleted the jira/LUCENE-10335 branch January 3, 2022 09:38
@uschindler uschindler restored the jira/LUCENE-10335 branch January 3, 2022 09:40
@uschindler
Copy link
Contributor Author

Hi @dweiss: I will merge in the changes to your branch, as this is a bit more complicated (I also need to revert some changes on your branch)

@dweiss
Copy link
Contributor

dweiss commented Jan 3, 2022

No, please don't - I'll do it later.

@uschindler
Copy link
Contributor Author

Oh sorry, I fixed it already. You can revert if you like, but looks well now.

@dweiss
Copy link
Contributor

dweiss commented Jan 3, 2022

No worries, thanks Uwe.

@uschindler
Copy link
Contributor Author

Hm, now on main this causes an ECJ warning.... But it does not fail build. That's ok for now.

@uschindler
Copy link
Contributor Author

> Task :lucene:core.tests:ecjLintTest
----------
1. WARNING in C:\Users\Uwe Schindler\Projects\lucene\lucene\lucene\core.tests\src\test\module-info.java (at line 27)
        opens org.apache.lucene.core.testresources to
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The package org.apache.lucene.core.testresources does not exist or is empty
----------
1 problem (1 warning)

I will check how to fix this. Its a valid warning because the package has no class files. But that's wanted here for testing purposes.

@uschindler
Copy link
Contributor Author

I fixed the warning in main by a fake "package-info.java"

@mocobeta
Copy link
Contributor

I can't look closely at or review the whole change soon since I have to leave for a while (my father just passed away due to old age and lots of work is waiting for me), so please go ahead.

I'm sorry for the sudden notice - I was not sure how to properly inform such things in English - but we have finished the funeral and important memorial services, and I'm back to day-to-day life (I will have to deal with the various paperwork for the next months, though).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants