Remove ByteBufferIndexInput and update all Panama implementations (MMap and Vector) to Java 21 #13146

Merged
merged 18 commits into from
Feb 29, 2024

uschindler
Contributor

This PR updates the MR-JAR parts to only have implementations of Java 21.

This PR does not remove the sourceSets for Java 21, although this is also our base version:

  • When compiling the vector code we need an apijar anyway, as the APIs change in every Java release.
  • In Java 21, MemorySegment is still a preview API, so we need to compile against the apijar, too.

Because compiling against the apijar is a hack, we do not want to do this for the main sourceSet. So the PR keeps a separate sourceSet with the Java 21 classes for vector and MemorySegment.

In the current state the Java 21 classes are still put into a MR-JAR part. I don't want to remove this for now:

  • We will need more vector source sets in future (not yet for Java 22)

We could merge the Java 21 classes into the main part of the JAR file. The Gradle code could just compare the base version with the MR-JAR sourceset and if the version is identical (minJavaVersion==sourcesetVersion) it could copy the files into the main part of the JAR.

I will try this in a separate commit.

@uschindler uschindler added this to the 10.0.0 milestone Feb 29, 2024
@uschindler uschindler self-assigned this Feb 29, 2024
@uschindler
Contributor Author

Because this also updated ASM to correct versions, I changed the JavascriptCompiler to use Java 21 class file format.

Contributor

@dweiss dweiss left a comment

Nice cleanup.

@uschindler
Contributor Author

Hi,
I will make a separate PR for this: At the moment the separate Java 21 sourceSet ends up in a versioned MR-JAR section. We do not currently need an MR-JAR, because we can merge the classes together, so I propose to add this code:

    tasks.named('jar').configure {
      boolean needMRJAR = false;
      mrjarJavaVersions.each { jdkVersion ->
        boolean isBaseVersion = (jdkVersion.toString() == rootProject.minJavaVersion.toString())
        into(isBaseVersion ? '' : "META-INF/versions/${jdkVersion}") {
          from sourceSets["main${jdkVersion}"].output
        }
        needMRJAR |= !isBaseVersion
      }

      if (needMRJAR) {
        manifest.attributes(
          'Multi-Release': 'true'
        )
      }
    }

This copies the files from the main21 sourceSet to the JAR's main folder. It only sets the Multi-Release: true manifest entry if there is a sourceSet for a Java version other than the base version (21).
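
The layout rule described here can be restated outside Gradle. The following is a minimal Java sketch, not the build code itself: the names `MrJarLayout`, `entryPrefix`, and `needsMultiRelease` are hypothetical and only mirror the logic of the Groovy snippet.

```java
import java.util.List;

// Hypothetical helper mirroring the Gradle snippet; not part of the Lucene build.
final class MrJarLayout {
  private MrJarLayout() {}

  /** Returns the JAR path prefix for classes compiled for the given JDK version. */
  static String entryPrefix(int jdkVersion, int minJavaVersion) {
    // Classes for the base version go to the JAR root, all others to a versioned folder.
    return jdkVersion == minJavaVersion ? "" : "META-INF/versions/" + jdkVersion;
  }

  /** True if at least one source set targets a version other than the base version. */
  static boolean needsMultiRelease(List<Integer> mrjarJavaVersions, int minJavaVersion) {
    return mrjarJavaVersions.stream().anyMatch(v -> v != minJavaVersion);
  }

  public static void main(String[] args) {
    int base = 21;
    for (int v : List.of(21, 22)) {
      System.out.println(v + " -> '" + entryPrefix(v, base) + "'");
    }
    System.out.println("Multi-Release needed: " + needsMultiRelease(List.of(21, 22), base));
  }
}
```

With only a Java 21 source set and a base version of 21, `needsMultiRelease` returns false, which corresponds to a plain JAR without the manifest attribute.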

@@ -50,9 +50,6 @@ grant {
permission java.lang.RuntimePermission "getStackTrace";
// needed for mock filesystems in tests
permission java.lang.RuntimePermission "fileSystemProvider";
// needed to test unmap hack on platforms that support it
permission java.lang.RuntimePermission "accessClassInPackage.sun.misc";
Contributor Author

This is my favourite change of the whole PR!

Contributor

Very nice!! 👍

public void testAceWithThreads() throws Exception {
assumeTrue("Test requires MemorySegmentIndexInput", isMemorySegmentImpl());
Member

this is my favorite part of the whole PR

Member

@rmuir rmuir left a comment

finally, SIGSEGV becomes impossible from MMapDirectory. Thank you Uwe!

@@ -157,7 +157,7 @@ public static FSDirectory open(Path path) throws IOException {

/** Just like {@link #open(Path)}, but allows you to also specify a custom {@link LockFactory}. */
public static FSDirectory open(Path path, LockFactory lockFactory) throws IOException {
Member

FYI: there are still some references to the unmap-hack in the class javadocs of this file. It's gotta feel great nuking these warnings :)

Contributor Author

will grep through the code :-)

Contributor Author

Fixed in 0f146ac. Thanks Robert!

@rmuir
Member

rmuir commented Mar 1, 2024

I'm not sure how you debugged that!

I tried to debug what gradle is doing with this tests counter (hey, a stacktrace of the offending setAccessible would be nice), but I think @dweiss has seen this before, it is not really possible. As soon as you set java.security.debug on the test JVM, gradle test runner dies because it doesn't like the stderr prints or something?

I tried to just add something like this to tests.jvmargs:

-Djava.security.debug=access,failure,permission=java.lang.reflect.ReflectPermission

Trying to debug just crashes the tests instead, it hits StackOverflowError...

OpenJDK 64-Bit Server VM warning: Potentially dangerous stack overflow in ReservedStackAccess annotated method java.io.BufferedWriter.write(Ljava/lang/String;II)V [1]
org.gradle.internal.event.ListenerNotificationException: Failed to notify output event listener.
...
Caused by: java.lang.StackOverflowError
...

@dweiss
Contributor

dweiss commented Mar 1, 2024

It's this issue -
gradle/gradle#11609

they closed the issue but it's still not working as expected.

@uschindler
Contributor Author

uschindler commented Mar 1, 2024

I'm not sure how you debugged that!

I did not debug that; it was an observation and then trial and error. I was a bit annoyed yesterday, so here are my observations:

  • Jenkins did not fail. It looks like on Jenkins/CI the tests do not run with the animated status display, so all works fine: at the end Gradle knows how many tests were executed, and the condition "fail if number of executed tests is zero" does not trigger. I cannot confirm that, but for some reason this is not a problem when running tests in CI.
  • If you run tests in your console with colored display and those Gradle animations, I observed the following: The tests ran successfully, but the first line of Gradle output, where the test counter is displayed, stays at "0 tests". At the end the build fails with the error "no tests ran", although it spent minutes on (successfully) executing all tests. If you enable verbose test output: no exceptions, nothing! Every test shows a reproduce line at the end, but no failure message or stack trace.

I tried to debug what gradle is doing with this tests counter (hey, a stacktrace of the offending setAccessible would be nice), but I think @dweiss has seen this before, it is not really possible. As soon as you set java.security.debug on the test JVM, gradle test runner dies because it doesn't like the stderr prints or something?

No idea. I was close to jumping out of the window last night. I did not try to debug, I just reverted test changes step by step until the policy file came to my mind....

After reverting the changes in policy, the test counter was incrementing again. I did not do any further investigations or debugging. I was just annoyed and angry and very tired (2:30 in the morning).

This is all bad, no debugging possible.

Contributor

@ChrisHegarty ChrisHegarty left a comment

Belated LGTM.

@@ -50,9 +50,6 @@ grant {
permission java.lang.RuntimePermission "getStackTrace";
// needed for mock filesystems in tests
permission java.lang.RuntimePermission "fileSystemProvider";
// needed to test unmap hack on platforms that support it
permission java.lang.RuntimePermission "accessClassInPackage.sun.misc";
permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
Contributor Author

unfortunately this one had to be reverted :-(

@dweiss
Contributor

dweiss commented Mar 1, 2024

I did track it down to gradle's test listeners - it looks like a bug to me. Whatever this delegation is doing, even a simple check on whether the target method is already accessible (or even public) would be sufficient for this to work.
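
The check suggested here could look roughly like the following. This is only a sketch: `AccessibleGuard` and `makeAccessibleIfNeeded` are hypothetical names, and nothing in it is taken from Gradle's sources. It relies on `AccessibleObject.canAccess` (available since Java 9) to skip `setAccessible` when the member can already be invoked.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper; it illustrates the suggested check and is not Gradle's code.
final class AccessibleGuard {
  private AccessibleGuard() {}

  /** Calls setAccessible(true) only if the caller cannot already invoke the method. */
  static boolean makeAccessibleIfNeeded(Method method, Object target) {
    // canAccess() expects null for static methods and an instance otherwise.
    Object receiver = Modifier.isStatic(method.getModifiers()) ? null : target;
    if (method.canAccess(receiver)) {
      return false; // already accessible: no ReflectPermission check is triggered
    }
    method.setAccessible(true); // may be checked against "suppressAccessChecks"
    return true;
  }

  public static void main(String[] args) throws Exception {
    Method publicMethod = String.class.getMethod("length");
    System.out.println("setAccessible needed: " + makeAccessibleIfNeeded(publicMethod, "abc"));
  }
}
```

For a public method such as `String.length()`, the helper returns without calling `setAccessible`, so a restrictive policy would not be consulted at all.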

Oh well.

@uschindler
Contributor Author

uschindler commented Mar 1, 2024

I did track it down to gradle's test listeners - it looks like a bug to me. Whatever this delegation is doing, even a simple check on whether the target method is already accessible (or even public) would be sufficient for this to work.

Oh well.

And on top of this, as usual AccessController#doPrivileged is missing, which is the main bug here. With AccessController#doPrivileged, the AllPermission that the policy file grants to Gradle's Test Runner JAR codesource would apply. But without it, the top-level caller (our test case) is hit by its more restricted permissions.
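
For illustration, wrapping a reflective call in `AccessController.doPrivileged` might look like this. The class and method names are made up for the example, and the API is deprecated for removal in current JDKs, so treat it as a sketch of the pattern under a security manager rather than a recommendation.

```java
import java.lang.reflect.Method;
import java.security.AccessController;
import java.security.PrivilegedAction;

// Hypothetical library-side helper; names are illustrative only.
final class PrivilegedReflection {
  private PrivilegedReflection() {}

  /**
   * Makes the method accessible inside a privileged block. Under a security
   * manager the permission check then stops at this frame and ignores the
   * callers further up the stack (for example a restricted test class).
   */
  @SuppressWarnings("removal") // AccessController is deprecated for removal since Java 17
  static Method makeAccessible(Method method) {
    return AccessController.doPrivileged(
        (PrivilegedAction<Method>) () -> {
          method.setAccessible(true);
          return method;
        });
  }

  private static String secret() {
    return "ok";
  }

  public static void main(String[] args) throws Exception {
    Method m = makeAccessible(PrivilegedReflection.class.getDeclaredMethod("secret"));
    System.out.println(m.invoke(null));
  }
}
```

Without the privileged block, every protection domain on the call stack has to hold the permission, which matches the failure described in this comment.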

Oh well! puke

P.S.: Where is the exception swallowed?

@ChrisHegarty
Contributor

ChrisHegarty commented Mar 4, 2024

I had trouble identifying the root cause of the need for the security permission grant, and setting the java security debug property just made things fail in a different and unhelpful way, so I reproduced with a "modified" JDK - that emits the security debugging output to a temp file (rather than System.err). Here's a few stacktraces, showing where this fails, for me:

access: domain that failed ProtectionDomain  (file:/Users/chegar/.gradle/caches/modules-2/files-2.1/junit/junit/4.13.1/cdd00374f1fee76b11e2a9d127405aa3f6be5b6a/junit-4.13.1.jar <no signer certificates>)
 jdk.internal.loader.ClassLoaders$AppClassLoader@54bedef2
 <no principals>
 java.security.Permissions@2d52216b (
 ("java.lang.RuntimePermission" "exitVM")
 ("java.io.FilePermission" "/Users/chegar/.gradle/caches/modules-2/files-2.1/junit/junit/4.13.1/cdd00374f1fee76b11e2a9d127405aa3f6be5b6a/junit-4.13.1.jar" "read")
)

        
java.lang.Exception: Stack trace
        at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:435)
        at java.base/java.security.AccessController.checkPermission(AccessController.java:1085)
        at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:411)
        at java.base/java.lang.Class.checkMemberAccess(Class.java:3227)
        at java.base/java.lang.Class.getDeclaredFields(Class.java:2540)
        at junit@4.13.1/org.junit.runners.model.TestClass.getSortedDeclaredFields(TestClass.java:77)
        at junit@4.13.1/org.junit.runners.model.TestClass.scanAnnotatedMembers(TestClass.java:70)
        at junit@4.13.1/org.junit.runners.model.TestClass.<init>(TestClass.java:57)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$12.run(RandomizedRunner.java:1092)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$12.run(RandomizedRunner.java:1089)
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:319)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.getAnnotatedFieldValues(RandomizedRunner.java:1089)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.wrapMethodRules(RandomizedRunner.java:1075)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:952)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
        at junit@4.13.1/org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
        at java.base/java.lang.Thread.run(Thread.java:1583)
access: access denied ("java.lang.reflect.ReflectPermission" "suppressAccessChecks")
access: domain that failed ProtectionDomain  (file:/Users/chegar/.gradle/caches/modules-2/files-2.1/junit/junit/4.13.1/cdd00374f1fee76b11e2a9d127405aa3f6be5b6a/junit-4.13.1.jar <no signer certificates>)
 jdk.internal.loader.ClassLoaders$AppClassLoader@54bedef2
 <no principals>
 java.security.Permissions@2d52216b (
 ("java.lang.RuntimePermission" "exitVM")
 ("java.io.FilePermission" "/Users/chegar/.gradle/caches/modules-2/files-2.1/junit/junit/4.13.1/cdd00374f1fee76b11e2a9d127405aa3f6be5b6a/junit-4.13.1.jar" "read")
)


java.lang.Exception: Stack trace
        at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:435)
        at java.base/java.security.AccessController.checkPermission(AccessController.java:1085)
        at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:411)
        at java.base/java.lang.Class.checkMemberAccess(Class.java:3227)
        at java.base/java.lang.Class.getDeclaredMethods(Class.java:2674)
        at junit@4.13.1/org.junit.internal.MethodSorter.getDeclaredMethods(MethodSorter.java:54)
        at junit@4.13.1/org.junit.runners.model.TestClass.scanAnnotatedMembers(TestClass.java:65)
        at junit@4.13.1/org.junit.runners.model.TestClass.<init>(TestClass.java:57)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$12.run(RandomizedRunner.java:1092)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$12.run(RandomizedRunner.java:1089)
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:319)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.getAnnotatedFieldValues(RandomizedRunner.java:1089)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.wrapMethodRules(RandomizedRunner.java:1075)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:952)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
        at junit@4.13.1/org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
        at java.base/java.lang.Thread.run(Thread.java:1583)
access: access denied ("java.lang.reflect.ReflectPermission" "suppressAccessChecks")
access: domain that failed ProtectionDomain  (file:/Users/chegar/git/tmp/lucene/lucene/test-framework/build/libs/lucene-test-framework-10.0.0-SNAPSHOT.jar <no signer certificates>) jdk.internal.loader.ClassLoaders$AppClassLoader@54bedef2
 <no principals>
 java.security.Permissions@45970c73 (
 ("java.lang.RuntimePermission" "exitVM")
 ("java.io.FilePermission" "/Users/chegar/git/tmp/lucene/lucene/test-framework/build/libs/lucene-test-framework-10.0.0-SNAPSHOT.jar" "read")
)


java.lang.Exception: Stack trace
        at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:435)
        at java.base/java.security.AccessController.checkPermission(AccessController.java:1085)
        at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:411)
        at java.base/java.lang.reflect.AccessibleObject.checkPermission(AccessibleObject.java:92)
        at java.base/java.lang.reflect.Method.setAccessible(Method.java:196)
        at junit@4.13.1/org.junit.runners.model.FrameworkMethod.<init>(FrameworkMethod.java:35)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.wrapMethodRules(RandomizedRunner.java:1064)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:952)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
        at junit@4.13.1/org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
        at java.base/java.lang.Thread.run(Thread.java:1583)

@ChrisHegarty
Contributor

Adding explicit grants to the above 3 identified codebases (randomizedtesting-runner, junit, and lucene-test-framework), allows the test count to work (for me), e.g. (quickly hacked with hardcoded paths)

grant codeBase "file:/Users/chegar/.gradle/caches/modules-2/files-2.1/com.carrotsearch.randomizedtesting/randomizedtesting-runner/2.8.1/55ffe691e90d31ab916746516654b5701e532d6f/randomizedtesting-runner-2.8.1.jar" {
  permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
};

grant codeBase "file:/Users/chegar/.gradle/caches/modules-2/files-2.1/junit/junit/4.13.1/cdd00374f1fee76b11e2a9d127405aa3f6be5b6a/junit-4.13.1.jar" {
  permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
};

grant codeBase "file:/Users/chegar/git/tmp/lucene/lucene/test-framework/build/libs/lucene-test-framework-10.0.0-SNAPSHOT.jar" {
  permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
};

@uschindler
Contributor Author

Cool. So it is not gradle, which breaks.

Does counting while the tests are running work then?

@uschindler
Contributor Author

Basically we would need one more AccessController#doPrivileged for the third stack trace. For the first two, it would be enough to whitelist the randomized runner.

@dweiss
Contributor

dweiss commented Mar 4, 2024

Have you looked at the test output xmls, Chris? Did the tests actually execute?

There is a doPrivileged wrapper in RR - I'm not sure why it requires permissions, it shouldn't.
https://github.com/randomizedtesting/randomizedtesting/blob/89643472e34aff0bba5b02897d9d6443bdc2e63b/randomized-runner/src/main/java/com/carrotsearch/randomizedtesting/RandomizedRunner.java#L1089-L1094

@dweiss
Contributor

dweiss commented Mar 4, 2024

I can reproduce your results (tests do execute, even if you exclude randomizedtesting jar - the count doesn't show up in gradle properly then though).

@dweiss
Contributor

dweiss commented Mar 4, 2024

Things like this:

        at java.base/java.lang.reflect.Method.setAccessible(Method.java:196)
        at junit@4.13.1/org.junit.runners.model.FrameworkMethod.<init>(FrameworkMethod.java:35)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.wrapMethodRules(RandomizedRunner.java:1064)
        at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:952)

will be very difficult to predict - the runner interacts closely with junit in so many places that I think granting it a permission for reflection will be an easier workaround than trying to figure out where such calls can be made from within junit...

I'm also surprised none of the gradle callbacks violate the security permissions. Could it be that I stopped on that breakpoint in gradle's server-side code?! it is confusing.

@dweiss
Contributor

dweiss commented Mar 4, 2024

Ok, I forgot we actually allow gradle to do anything:

// Grant all permissions to Gradle test runner classes.
grant codeBase "file:${gradle.lib.dir}${/}-" {
  permission java.security.AllPermission;
};

grant codeBase "file:${gradle.worker.jar}" {
  permission java.security.AllPermission;
};

I think I could try to locate places in RandomizedRunner where it calls into JUnit without doPrivileged... not sure if it's worth the hassle though - maybe adding permissions for just those three jars is fine (we can compute the URLs and pass them as properties)?
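
One way to compute such URLs, sketched here under the assumption that the relevant classes are loaded from JAR files on the classpath, is to ask each class for its `CodeSource` location. `CodeBaseLocator`, `codeBaseOf`, and `grantFor` are hypothetical names used only for this example.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical helper for generating policy entries; not part of the Lucene build.
final class CodeBaseLocator {
  private CodeBaseLocator() {}

  /** Returns the location a class was loaded from, if the JVM records one. */
  static Optional<URL> codeBaseOf(Class<?> clazz) {
    CodeSource source = clazz.getProtectionDomain().getCodeSource();
    // Classes from the boot loader (for example java.lang.String) have no code source.
    return source == null ? Optional.empty() : Optional.ofNullable(source.getLocation());
  }

  /** Formats a policy grant for one code base, like the hard-coded entries earlier in the thread. */
  static String grantFor(String codeBaseUrl) {
    return "grant codeBase \"" + codeBaseUrl + "\" {\n"
        + "  permission java.lang.reflect.ReflectPermission \"suppressAccessChecks\";\n"
        + "};\n";
  }

  public static void main(String[] args) {
    codeBaseOf(CodeBaseLocator.class)
        .ifPresent(url -> System.out.print(grantFor(url.toString())));
  }
}
```

In a build, the resulting URL would more likely be passed to the policy file as a system property, in the same way `${gradle.worker.jar}` is used, rather than printed as a literal grant.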

@rmuir
Member

rmuir commented Mar 4, 2024

There are also some setAccessible() calls in our tests...

@uschindler
Contributor Author

There are also some setAccessible() calls in our tests...

Die, die, die. 🤬😱🤨

@rmuir
Member

rmuir commented Mar 4, 2024

We should ban it in forbidden-apis if not already... also some of the usages in tests should be reviewed.

Two of them are the JVM-crashing tests.
Maybe it is enough to just Runtime.halt() these days?
Some of this was paranoia around things such as shutdown-hooks closing open file handles and the like, I think it might be irrelevant?

The other is in RAMUsageTester, I don't understand it enough.

@ChrisHegarty
Contributor

I think I could try to locate places in RandomizedRunner where it calls into JUnit without doPrivileged... not sure if it's worth the hassle though - maybe adding permissions for just those three jars is fine (we can compute the URLs and pass them as properties)?

Putting aside the test specific usages mentioned above, then... so long as there is no test code on the stack when setAccessible is called, those three jars should be sufficient. At least from what I see on my machine.

I do see the test count working fine with just those three jars (as well as the global grants removed). Here's an example from ./gradlew :lucene:core:test

[Screenshot, 2024-03-04 20:14:32; image not preserved]

Contributor

@dsmiley dsmiley left a comment

I don't see CHANGES.txt updated; was it forgotten? Some tuning settings have disappeared now. org.apache.lucene.store.MMapDirectory.enableMemorySegments=false in particular is one I'm using in 9.8 to recover a big performance loss from an 8x upgrade. JDK 21. We'll eventually do more analysis to see what's going on. Perhaps we'll chat at Berlin Buzzwords soon.

@uschindler
Contributor Author

Do you have highly concurrent closes of index files? Due to the new features regarding safe close, the close is more expensive (especially for other threads concurrently accessing index files). This comes from a safepoint for a thread-local handshake that prevents access to the unmapped memory after close. This is a known limitation.
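
To make the "safe close" behaviour concrete, here is a small sketch using the `java.lang.foreign` API directly, without any Lucene code. It assumes Java 22 or later (on Java 21 the API is still a preview and needs `--enable-preview`); the class name `SharedArenaCloseDemo` is hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Illustrative only; requires Java 22+ (or Java 21 with --enable-preview).
final class SharedArenaCloseDemo {
  private SharedArenaCloseDemo() {}

  /** Returns true if reading a segment after its arena is closed is rejected. */
  static boolean accessAfterCloseFails() {
    Arena arena = Arena.ofShared(); // shared: segments may be used from any thread
    MemorySegment segment = arena.allocate(ValueLayout.JAVA_LONG);
    segment.set(ValueLayout.JAVA_LONG, 0, 42L);
    arena.close(); // frees the memory; per the comment, this step gets costly under concurrency
    try {
      segment.get(ValueLayout.JAVA_LONG, 0);
      return false;
    } catch (IllegalStateException e) {
      return true; // a Java exception instead of a JVM crash
    }
  }

  public static void main(String[] args) {
    System.out.println("access after close rejected: " + accessAfterCloseFails());
  }
}
```

The example is single-threaded, so it shows only the safety guarantee, not the cost that concurrent readers may pay while another thread closes an arena.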

@uschindler
Contributor Author

uschindler commented May 30, 2024

Changes entry is here:

* GITHUB#13146, GITHUB#13148: Remove ByteBufferIndexInput and only use MemorySegment APIs

It is Lucene 10 only.

@uschindler
Contributor Author

See also discussion here: dacapobench/dacapobench#264 (comment)

@dsmiley
Contributor

dsmiley commented May 30, 2024

Thanks Uwe! I suppose I blame the JDK then :-)
I question how bad the venerable ByteBufferIndexInput was to warrant removing it over the potential improvement value of better memory management with MemorySegment API. After all we have other Directory impls, etc. for a variety of use-cases/scenarios, why not ByteBufferIndexInput within MMapDirectory too?

@uschindler
Contributor Author

Thanks Uwe! I suppose I blame the JDK then :-) I question how bad the venerable ByteBufferIndexInput was to warrant removing it over the potential improvement value of better memory management with MemorySegment API. After all we have other Directory impls, etc. for a variety of use-cases/scenarios, why not ByteBufferIndexInput within MMapDirectory too?

It was removed because it is unsafe (it can crash your JVM) and with current JEPs in development it will no longer work in JDK 24 because sun.misc.Unsafe will go away and therefore unmapping won't work anymore. So basically we get rid of unsafe and code which is no longer supported.

I am planning to open JDK issues because, judging from the code, it should not slow down that much. But it looks like a side effect: it sometimes deoptimizes code on concurrent access at the safepoint. I will check this with Maurizio. It could possibly be only a bug in Java 21/22.

So actually this is the reason why we have the sysprop in 9.x: so we can detect such bugs, report them to the JDK and get them fixed. So please share more information or a small benchmark showing the problem. In typical Elasticsearch use cases we have seen no slowdown, but rather a speedup. But on the other hand we did not close the IndexInput 10 times per second while hammering on them from 64 threads.

@uschindler
Contributor Author

See https://openjdk.org/jeps/471

@uschindler
Contributor Author

uschindler commented Jun 10, 2024

Hi @dsmiley,
Thanks for the quick talk at Berlin Buzzwords. Actually this looks like the same issue we have seen in the dacapobench.
When back at home I will try to write a JMH benchmark without Lucene code to reproduce the issue and measure how large it is (depending on how often you close an IndexInput and how many parallel threads are using other MemorySegments, including those not tied to the same IndexInput).

My idea is:

  • Have many threads working on "hot" MemorySegments allocated with different shared Arenas.
  • Have a few threads closing those Arenas.

Theoretically, the benchmark should not slow down if the few threads close Arenas which are not bound to any hot workers.

What we have seen in dacapobench is that it seems to affect all MemorySegments negatively, not only those which are affected by the close of an Arena.

Once I have the benchmark I will open an issue at OpenJDK.

In the meantime, we could "preserve" the old Lucene 9.1 implementation (the version without MemorySegments) in the "misc" module of Lucene as LegacyMMapDirectory. This would allow people to use it, but it is strongly discouraged. It won't work anymore with Java 24 (or around that time).

@uschindler
Contributor Author

I reopened #13325.
