Initial rewrite of MMapDirectory for JDK-16 preview (incubating) Panama APIs (>= JDK-16-ea-b32) #2176

uschindler · 2021-01-02T14:42:30Z

This is just a draft PR to give a first insight into memory mapping improvements in JDK 16+.

Some background information: Starting with JDK 14, there is a new incubating module "jdk.incubator.foreign" with a new, not yet stable API for accessing off-heap memory (later it will also support calling functions located in libraries like .so or .dll files using classical MethodHandles). This incubator module has gone through several versions:

first version: https://openjdk.java.net/jeps/370 (slow, very buggy, and thread-confined, which makes it unusable with Lucene)
second version: https://openjdk.java.net/jeps/383 (still thread-confined, but now allows transfer of "ownership" to other threads; this is still impossible to use with Lucene)
third version in JDK 16: https://openjdk.java.net/jeps/393 (this version includes "Support for shared segments"). This now allows us to safely use the same external mmapped memory from different threads and also unmap it!

This module more or less overcomes several problems:

The ByteBuffer API is limited to 32-bit addressing (in fact MMapDirectory has to chunk files into 1 GiB portions)
There is no official way to unmap ByteBuffers when the file is no longer used. There is a way to use sun.misc.Unsafe and forcefully unmap segments, but any IndexInput accessing the file from another thread will crash the JVM with SIGSEGV or SIGBUS. We learned to live with that and we happily apply the unsafe unmapping, but that's the main issue.
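
To make the 32-bit limit concrete: a ByteBuffer is indexed by int, so a single buffer cannot cover more than 2 GiB, and every long file position has to be translated into a (chunk, offset) pair. The following is a minimal sketch of that translation, assuming power-of-two chunks of 1 GiB. The class and method names are illustrative and are not taken from MMapDirectory.

```java
// Illustrative only: maps a long file position onto fixed-size chunks of 2^CHUNK_BITS bytes.
final class ChunkAddressing {
  // Assumption: 1 GiB chunks (2^30 bytes), matching the chunking mentioned above.
  static final int CHUNK_BITS = 30;
  static final long CHUNK_MASK = (1L << CHUNK_BITS) - 1;

  // Index of the chunk that contains the given absolute file position.
  static int chunkIndex(long pos) {
    return (int) (pos >>> CHUNK_BITS);
  }

  // Offset of the given absolute file position inside its chunk.
  static int chunkOffset(long pos) {
    return (int) (pos & CHUNK_MASK);
  }

  public static void main(String[] args) {
    long pos = 5L * (1L << 30) + 123; // 5 GiB + 123 bytes
    System.out.println(chunkIndex(pos) + " / " + chunkOffset(pos)); // prints "5 / 123"
  }
}
```

With a long-indexed API this translation is only needed for files larger than the (much bigger) chunk size.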

@uschindler had many discussions with the team at OpenJDK, and finally, with the third incubator, we have an API that works with Lucene. The discussions were very fruitful (thanks to @mcimadamore!).

With the third incubator we are now finally able to do some tests (especially performance). As this is an incubating module, this PR first changes the build system a bit:

disable -Werror for :lucene:core
add the incubating module to the compiler of :lucene:core and enable it for all test builds. This is important, as you also have to pass --add-modules jdk.incubator.foreign at runtime!

The code basically just modifies MMapDirectory to use LONG instead of INT for the chunk size parameter. In addition it adds MemorySegmentIndexInput that is a copy of our ByteBufferIndexInput (still there, but unused), but using MemorySegment instead of ByteBuffer behind the scenes. It works in exactly the same way, just the try/catch blocks for supporting EOFException or moving to another segment were rewritten.
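
The "fast path first, then move to another segment or signal EOF" control flow mentioned here can be sketched without the incubating API. The stand-in below uses plain byte[] chunks in place of MemorySegment; ChunkedReader and its fields are hypothetical and only illustrate the pattern, not the actual MemorySegmentIndexInput code.

```java
import java.io.EOFException;
import java.io.IOException;

// Illustrative stand-in: reads bytes across several chunks; byte[] replaces MemorySegment.
final class ChunkedReader {
  private final byte[][] chunks;
  private int curIndex = 0; // index of the current chunk
  private int curPos = 0;   // read position inside the current chunk

  ChunkedReader(byte[][] chunks) {
    this.chunks = chunks;
  }

  byte readByte() throws IOException {
    try {
      // Fast path: no explicit bounds check, rely on the array's own check.
      byte b = chunks[curIndex][curPos];
      curPos++;
      return b;
    } catch (ArrayIndexOutOfBoundsException e) {
      // Slow path: current chunk is exhausted, move on to the next non-empty chunk.
      do {
        curIndex++;
        if (curIndex >= chunks.length) {
          throw new EOFException("read past EOF");
        }
      } while (chunks[curIndex].length == 0);
      curPos = 1;
      return chunks[curIndex][0];
    }
  }

  public static void main(String[] args) throws IOException {
    ChunkedReader in = new ChunkedReader(new byte[][] {{1, 2}, {3}});
    System.out.println(in.readByte() + " " + in.readByte() + " " + in.readByte()); // prints "1 2 3"
  }
}
```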

The openInput code uses MemorySegment.mapFile() to get a memory mapping. This method is unfortunately a bit buggy in JDK-16-ea-b30, so I added some workarounds. See JDK issues: https://bugs.openjdk.java.net/browse/JDK-8259027, https://bugs.openjdk.java.net/browse/JDK-8259028, https://bugs.openjdk.java.net/browse/JDK-8259032, https://bugs.openjdk.java.net/browse/JDK-8259034. The bugs with alignment and zero-byte mmaps are fixed in b32, and this PR was adapted accordingly (hacks removed).

It passes all tests and it looks like you can use it to read indexes. The default chunk size is now 16 GiB (but you can raise or lower it as you like; tests are doing this). Of course you can set it to Long.MAX_VALUE, in which case every index file is always mapped as one big memory mapping. My testing with Windows 10 has shown that this is not a good idea!!! Huge mappings fragment the address space over time, and as we can only use about 43 or 46 bits (depending on the OS), the fragmentation will at some point kill you. So 16 GiB looks like a good compromise: most files will be smaller than 6 GiB anyway (unless you optimize your index to one huge segment). So for most Lucene installations, the number of mapped memory segments will equal the number of open files, so heavy Elasticsearch users will be very happy. The sysctl max_map_count may not need to be touched anymore.
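
As a rough illustration of why the larger chunk size reduces pressure on max_map_count, the number of mappings per file is just a ceiling division of the file size by the chunk size. The helper below is a hypothetical sketch, not code from this PR.

```java
// Illustrative only: how many mappings a file needs for a given maximum chunk size.
final class MappingCount {
  // Ceiling division; an empty file is counted as needing zero mappings in this sketch.
  static long mappingsFor(long fileSize, long maxChunkSize) {
    if (fileSize < 0 || maxChunkSize <= 0) {
      throw new IllegalArgumentException("fileSize must be >= 0 and maxChunkSize > 0");
    }
    return fileSize == 0 ? 0 : (fileSize - 1) / maxChunkSize + 1;
  }

  public static void main(String[] args) {
    long gib = 1L << 30;
    System.out.println(mappingsFor(40 * gib, gib));            // prints 40
    System.out.println(mappingsFor(40 * gib, 16 * gib));       // prints 3
    System.out.println(mappingsFor(40 * gib, Long.MAX_VALUE)); // prints 1
  }
}
```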

In addition, this implements readLELongs in a better way than @jpountz did (no caching or arbitrary objects). Nevertheless, the new MemorySegment API relies on final, unmodifiable classes, and copying memory from a MemorySegment to an on-heap Java array requires us to wrap that array in a MemorySegment each time (e.g. in readBytes() or readLELongs), so there may be some overhead due to short-lived object allocations (those are NOT reusable!!!). In short: In future we should do away with copying/loading our stuff to the heap and maybe throw away IndexInput completely and base our code fully on random access. The new foreign-vector APIs will in future also be written with MemorySegment as their focus. So you can allocate a vector view on a MemorySegment and let the vectorizer work fully outside the Java heap, inside our mmapped files! :-)

It would be good if you could check out this branch and try it in production.

But be aware:

You need JDK 11 to run Gradle (set JAVA_HOME to it)
You need JDK 16-ea-b32 (set RUNTIME_JAVA_HOME to it)
The lucene-core.jar will contain JDK 16 class files and requires JDK 16 to execute.
Also you need to add --add-modules jdk.incubator.foreign to the command line of your Java program/Solr server/Elasticsearch server

It would be good to get some benchmarks, especially by @rmuir or @mikemccand. Take your time and enjoy the complexity of setting this up! ;-)

My plan is the following:

report any bugs or slowness, especially with Hotspot optimizations. The last time I talked to Maurizio, he talked about Hotspot not being able to fully optimize for-loops that use long instead of int, so it may take some time until the full performance is there.
wait until the final version of project PANAMA-foreign goes into Java's Core Library (no module needed anymore)
add an MR-JAR for lucene-core.jar and compile the MemorySegmentIndexInput and maybe some helper classes with JDK 17/18/19 (hopefully?).

In addition, there are some comments in the code about safety (e.g., we need IOUtils.close() to take AutoCloseable instead of just Closeable, so we can also enforce that all memory segments are closed after usage). Also, by default all VarHandles are aligned: they refuse to read a LONG from an address which is not a multiple of 8. I had to disable this feature, as all our index files are heavily unaligned. We should in the meantime not only convert our files to little endian, but also make all non-compressed types (like long[] arrays or non-encoded integers) be aligned to the correct boundaries in files. The most horrible thing I have seen is that our CFS file format starts the "inner" files totally unaligned. We should fix the CFSWriter to start new files always at multiples of 8 bytes. I will open an issue about this.

jpountz · 2021-01-02T15:59:30Z

> In future we should do away with copying/loading our stuff to the heap and maybe throw away IndexInput completely and base our code fully on random access. The new foreign-vector APIs will in future also be written with MemorySegment as their focus. So you can allocate a vector view on a MemorySegment and let the vectorizer work fully outside the Java heap, inside our mmapped files! :-)

+1 Things work like they do today because auto-vectorization only works on on-heap arrays, and this readLELongs method was the fastest way to copy data from a directory to a long[]. @msokolov and I were just discussing a few days ago how this vector API might help drop the copy entirely in the future, which would be great.

msokolov · 2021-01-02T16:11:34Z

Very exciting! Thank you for leading the way here, Uwe.

> The most horrible thing I have seen is that our CFS file format starts the "inner" files totally unaligned. We should fix the CFSWriter to start new files always at multiples of 8 bytes. I will open an issue about this.

I ran into this just yesterday as I was playing around with aligning vectors in their index files to see if any perf bump could be gained (the header seems to be 50 bytes usually, so some padding is needed). And then when I asserted alignment, realized that CFS was messing with it.

uschindler · 2021-01-02T16:14:31Z

> Very exciting! Thank you for leading the way here, Uwe.
>
> The most horrible thing I have seen is that our CFS file format starts the "inner" files totally unaligned. We should fix the CFSWriter to start new files always at multiples of 8 bytes. I will open an issue about this.
>
> I ran into this just yesterday as I was playing around with aligning vectors in their index files to see if any perf bump could be gained (the header seems to be 50 bytes usually, so some padding is needed). And then when I asserted alignment, realized that CFS was messing with it.

In fact, for CFS files we don't even need to change the file format / version number. We just have to make sure that the CFS writer starts new inner files aligned to 8 bytes. That should be easy to implement.
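
For illustration, rounding a file offset up to the next multiple of 8 is a one-line bit operation. The sketch below is a hypothetical helper (it is not the CFS writer code) and uses the 50-byte header mentioned above as its example.

```java
// Illustrative only: rounds offsets up to a power-of-two alignment boundary.
final class Alignment {
  // Returns the smallest multiple of alignment that is >= offset.
  static long alignUp(long offset, int alignment) {
    if (offset < 0 || alignment <= 0 || Integer.bitCount(alignment) != 1) {
      throw new IllegalArgumentException("offset must be >= 0 and alignment a power of two");
    }
    return (offset + alignment - 1) & ~((long) alignment - 1);
  }

  // Number of padding bytes to write before the next aligned position.
  static int padding(long offset, int alignment) {
    return (int) (alignUp(offset, alignment) - offset);
  }

  public static void main(String[] args) {
    System.out.println(alignUp(50, 8)); // prints 56
    System.out.println(padding(50, 8)); // prints 6
  }
}
```

A writer would emit padding(currentOffset, 8) zero bytes before starting the next inner file, and record the aligned offset as that file's start.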

uschindler · 2021-01-02T17:20:04Z

@dweiss I need your help here. There is one thing that drives me crazy: with the changes (actually adding --add-modules jdk.incubator.foreign to the test command line), JUnit 4 suddenly fails for many "non-tests" with exceptions like the one below, although those should never have been executed. These are mostly test helper classes (like the writer for Lucene60Points).

This seems to have to do with some automatic behavior of Gradle or JUnit once it detects the "Java module system", and I have no idea how to turn it off! What really puzzles me: why does it load those classes at all? I was trying to find the good old "Ant fileset" that has this **/*Test.class,**/Test*.class pattern, but it is no longer there with Gradle. How can we re-add it, if JUnit/Gradle seems to think that it needs some (broken) module system API?

Until this is fixed, I can't run all tests easily. I only ran them per module and ignored the tons of stack traces. PLEASE HELP!

org.apache.lucene.backward_codecs.lucene60.Lucene60PointsWriter > initializationError FAILED
    java.lang.IllegalArgumentException: Test class can only have one constructor
        at org.junit.runners.model.TestClass.<init>(TestClass.java:48)
        at org.junit.runners.JUnit4.<init>(JUnit4.java:23)
        at org.junit.internal.builders.JUnit4Builder.runnerForClass(JUnit4Builder.java:10)
        at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:70)
        at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:37)
        at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:70)
        at org.junit.internal.requests.ClassRequest.createRunner(ClassRequest.java:28)
        at org.junit.internal.requests.MemoizingRequest.getRunner(MemoizingRequest.java:19)
        at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:78)
        at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
        at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
        at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
        at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:567)
        at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
        at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
        at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
        at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
        at jdk.proxy1/jdk.proxy1.$Proxy2.processTestClass(Unknown Source)
        at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:119)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:567)
        at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
        at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
        at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:182)
        at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:164)
        at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:414)
        at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
        at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
        at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56)
        at java.base/java.lang.Thread.run(Thread.java:831)

uschindler · 2021-01-02T18:20:21Z

@dweiss I found a workaround by adding the includes/excludes that we use in the Ant build. Why were those removed? Should I maybe open a new issue to add them back in master?

I have the feeling this comes from the fact that class files compiled with JDK 16 and using the new class file format can't be "analyzed" by Gradle, so it assumes "no idea, let's run it because it's a class file". The automatic detection doesn't seem to work then. With the include/exclude patterns everything went back to normal.

Should we also add those lines to defaults-tests.gradle in the master branch? I don't trust this autodetection, especially as it may load and read class files that it does not need to look at at all:

      include '**/*Test.class', '**/Test*.class'
      exclude '**/*$*'
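
To spell out what the two lines above select, here is a rough Java approximation of the filter. It only looks at the file name and does not reproduce Gradle's or Ant's real pattern matching; the class name TestClassFilter is made up for this sketch.

```java
// Rough approximation of the include/exclude patterns above; not Gradle's actual matcher.
final class TestClassFilter {
  static boolean isTestClass(String relativePath) {
    // Only the last path element matters for these particular patterns.
    String name = relativePath.substring(relativePath.lastIndexOf('/') + 1);
    if (!name.endsWith(".class")) {
      return false;
    }
    // The exclude pattern drops nested and anonymous classes, whose names contain '$'.
    if (name.contains("$")) {
      return false;
    }
    // The include patterns accept names ending in "Test" or starting with "Test".
    return name.endsWith("Test.class") || name.startsWith("Test");
  }

  public static void main(String[] args) {
    System.out.println(isTestClass("org/apache/lucene/store/TestMMapDirectory.class")); // prints true
    System.out.println(
        isTestClass("org/apache/lucene/backward_codecs/lucene60/Lucene60PointsWriter.class")); // prints false
  }
}
```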

uschindler · 2021-01-02T18:24:15Z

FYI, after I was able to run tests in a more convenient way, I figured out that some tests fail from time to time because of using HandleTrackingFS. I disabled all custom filesystems in the PR, but it looks like some tests use HandleTrackingFS.

I forgot to mention this in my description above: MMapDirectory no longer works with a custom java.nio.file.FileSystem, as mapFile() tries to cast the FileChannel returned by our custom filesystem to the JDK-internal class (see JDK issue: https://bugs.openjdk.java.net/browse/JDK-8259028). For now we have to disable all custom file systems in our test suite until this bug is fixed.

uschindler · 2021-01-03T00:20:11Z

I fixed the remaining TODOs regarding a safe close of all segments when exceptions on map() occur. When closing the master IndexInput, we also make sure to unmap all segments, even though exceptions might occur (e.g. on concurrent access, close() may fail with IllegalStateException). Those exceptions are bubbled up.

As MemorySegment does not implement Closeable but the more generic AutoCloseable, I used IOUtils.applyToAll() with MemorySegment::close as the method reference to the close method (heavy functional interface adaptation, eh?)
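
The "close everything, then bubble up the failure" behavior can be sketched in plain Java. The closeAll helper below is a hypothetical stand-in written for this illustration; it does not show how IOUtils.applyToAll() is implemented.

```java
import java.util.List;

// Illustrative stand-in: attempts to close every resource and rethrows the first failure.
final class CloseAll {
  static void closeAll(List<? extends AutoCloseable> resources) throws Exception {
    Exception first = null;
    for (AutoCloseable resource : resources) {
      try {
        if (resource != null) {
          resource.close();
        }
      } catch (Exception e) {
        // Remember the first failure, attach later ones as suppressed, and keep going.
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }

  public static void main(String[] args) {
    List<AutoCloseable> resources = List.<AutoCloseable>of(
        () -> System.out.println("closed A"),
        () -> { throw new IllegalStateException("segment still in use"); },
        () -> System.out.println("closed C"));
    try {
      closeAll(resources);
    } catch (Exception e) {
      System.out.println("bubbled up: " + e.getMessage());
    }
  }
}
```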

uschindler · 2021-01-03T17:32:28Z

After some cleanup, I also added a workaround for https://bugs.openjdk.java.net/browse/JDK-8259028: In MMapDirectory we try to unwrap the path using reflection. This is for sure a hackydihickhack workaround, but works until this is fixed -- and now all tests pass for me! :-)

msokolov · 2021-01-03T20:33:13Z

Hi Uwe - just trying to get this patch working here; when I try to compile (gradlew lucene:core:jar) w/JDK 11 I get this: Could not target platform: 'Java SE 16' using tool chain: 'JDK 11 (11)'. So I tried w/JDK16 (hmm might not be the right version?) and get java.lang.IllegalAccessError: class org.gradle.internal.compiler.java.ClassNameCollector (in unnamed module @0x772989a6) cannot access class com.sun.tools.javac.code.Symbol$TypeSymbol (in module jdk.compiler) because module jdk.compiler does not export com.sun.tools.javac.code to unnamed module @0x772989a6. Yeah I have 16-ea+30-2130 - I think that was the version you specified?

msokolov · 2021-01-03T20:37:40Z

Ah, OK once I set RUNTIME_JAVA_HOME I was able to compile OK

uschindler · 2021-01-03T20:40:23Z

Hi, please read the instructions above, at "But be aware". Gradle is not compatible with JDK 16 at all, so Gradle needs to run with JDK 11. You can pass the JDK 16 directory using an environment variable or through a sysprop. gradlew shows all options when you run it without any arguments; read the instructions about alternative JVMs.

dweiss · 2021-01-03T20:42:15Z

See this, Mike:
https://github.com/apache/lucene-solr/blob/master/help/jvms.txt

lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java

uschindler · 2021-01-05T00:18:31Z

This would also explain slowdowns on short posting-list iterations, as readLELongs() with a small number of longs is not working well.

My plan would be to add an if statement at the beginning of readBytes() or readLELongs() like:

if (length < SOME_LIMIT) {
  // short read: copy element by element instead of wrapping the target array
  for (int i = offset; i < offset + length; i++) {
    bytes[i] = readByte();
  }
} else {
  ... current code with targetSlice ...
}
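
A self-contained version of this idea, with java.nio.ByteBuffer standing in for MemorySegment and a made-up threshold, could look like the sketch below. The class name, the value of SOME_LIMIT, and the use of ByteBuffer are assumptions for illustration; the real cut-over point would have to come from benchmarks.

```java
import java.nio.ByteBuffer;

// Illustrative stand-in: picks a per-byte loop for short reads and a bulk copy otherwise.
final class SmallReadFallback {
  // Hypothetical threshold; not a measured value.
  static final int SOME_LIMIT = 8;

  static void readBytes(ByteBuffer in, byte[] dst, int offset, int length) {
    if (length < SOME_LIMIT) {
      // Short read: a plain loop avoids the setup cost of a bulk copy.
      for (int i = offset; i < offset + length; i++) {
        dst[i] = in.get();
      }
    } else {
      // Long read: one bulk copy from the current position.
      in.get(dst, offset, length);
    }
  }

  public static void main(String[] args) {
    ByteBuffer in = ByteBuffer.wrap(new byte[] {10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21});
    byte[] dst = new byte[12];
    readBytes(in, dst, 0, 3); // takes the loop branch
    readBytes(in, dst, 3, 9); // takes the bulk branch
    System.out.println(dst[0] + " " + dst[11]); // prints "10 21"
  }
}
```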

msokolov · 2021-01-05T13:13:01Z

Uwe, I didn't think that IndexInput would expose its internal ByteBuffers easily? But you are right about the double-copying - that is why I opened https://issues.apache.org/jira/browse/LUCENE-9652. BTW I did test that change with and without this one and saw a similar slowdown there. Possibly a loop with a single float at a time would be better? It's highly counterintuitive to me, but when I get a moment, I will try it.

lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java

uschindler · 2021-01-06T23:33:27Z

Hi @msokolov: I added the readLEFloats() from #2175 here.

uschindler · 2021-01-07T00:09:45Z

FYI, I added an interface, oal.util.Unwrapable, that can be implemented by our mocking layers in test-framework. The new interface allows us to unwrap the Path without knowing about test-framework internals. IMHO, we should use this also for other mocking layers, so we can easily unwrap them with a consistent API.

The remaining 2 issues in MemorySegment.mapFile() were merged today, and as soon as those are part of another preview build of JDK 16, we can remove the mapFileBugfix() method.
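
The interface's actual signature is not shown in this thread. As an assumed shape, a generic "unwrap" contract plus a helper that peels off all wrapper layers could look like this; Res, PlainRes and MockRes are invented purely for the demonstration.

```java
// Assumed shape of an unwrapping contract; the real interface in this PR may differ.
interface Unwrapable<T> {
  // Returns the object this wrapper delegates to.
  T unwrap();

  // Peels off wrapper layers until a non-wrapper object is reached.
  @SuppressWarnings("unchecked")
  static <T> T unwrapAll(T o) {
    while (o instanceof Unwrapable) {
      o = ((Unwrapable<T>) o).unwrap();
    }
    return o;
  }
}

// Hypothetical resource type with a plain implementation and a mocking wrapper.
interface Res {
  String name();
}

final class PlainRes implements Res {
  private final String name;

  PlainRes(String name) {
    this.name = name;
  }

  @Override
  public String name() {
    return name;
  }
}

final class MockRes implements Res, Unwrapable<Res> {
  private final Res delegate;

  MockRes(Res delegate) {
    this.delegate = delegate;
  }

  @Override
  public String name() {
    return "mock(" + delegate.name() + ")";
  }

  @Override
  public Res unwrap() {
    return delegate;
  }
}

final class UnwrapDemo {
  public static void main(String[] args) {
    Res wrapped = new MockRes(new MockRes(new PlainRes("index")));
    System.out.println(wrapped.name());                       // prints "mock(mock(index))"
    System.out.println(Unwrapable.unwrapAll(wrapped).name()); // prints "index"
  }
}
```

The caller only depends on the small interface, so core code never needs to know which test-framework wrapper it was handed.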

uschindler · 2021-01-07T01:05:46Z

Hi @msokolov: If you have some time, you may check performance again. I rewrote the getBytes() and getLEXxxx() methods a bit.

msokolov · 2021-01-11T19:34:39Z

@uschindler I pulled the latest from this branch (ba61072), re-ran the benchmarks against updated master (6711eb7), and see similar results:

                    Task QPS baseline      StdDev QPS candidate      StdDev                Pct diff p-value
   BrowseMonthTaxoFacets        2.15      (7.0%)        1.20      (5.6%)  -44.2% ( -53% -  -33%) 0.000
BrowseDayOfYearTaxoFacets        2.04      (7.1%)        1.16      (5.8%)  -42.9% ( -52% -  -32%) 0.000
    BrowseDateTaxoFacets        2.04      (7.1%)        1.17      (5.8%)  -42.8% ( -51% -  -32%) 0.000
                 LowTerm     1099.09      (4.8%)      936.16      (3.2%)  -14.8% ( -21% -   -7%) 0.000
            OrNotHighLow      523.44      (7.0%)      450.50      (3.8%)  -13.9% ( -23% -   -3%) 0.000
                 Respell       42.23      (2.2%)       36.71      (2.5%)  -13.1% ( -17% -   -8%) 0.000
        AndHighMedVector      893.77      (2.9%)      786.53      (3.0%)  -12.0% ( -17% -   -6%) 0.000
              AndHighLow      662.65      (2.9%)      584.68      (2.0%)  -11.8% ( -16% -   -7%) 0.000
                PKLookup      136.02      (1.5%)      120.03      (1.5%)  -11.8% ( -14% -   -8%) 0.000
                  Fuzzy1       53.41      (6.4%)       47.18      (5.4%)  -11.7% ( -22% -    0%) 0.000
           MedTermVector     1067.80      (3.2%)      946.10      (2.3%)  -11.4% ( -16% -   -6%) 0.000
        AndHighLowVector      941.02      (3.8%)      834.30      (3.1%)  -11.3% ( -17% -   -4%) 0.000
       AndHighHighVector      908.27      (2.9%)      808.05      (2.1%)  -11.0% ( -15% -   -6%) 0.000
           LowTermVector     1010.16      (2.2%)      899.95      (1.9%)  -10.9% ( -14% -   -6%) 0.000
                 MedTerm     1167.78      (3.9%)     1043.79      (3.1%)  -10.6% ( -17% -   -3%) 0.000
   BrowseMonthSSDVFacets       11.99     (13.4%)       10.73      (7.4%)  -10.5% ( -27% -   11%) 0.002
              AndHighMed      155.89      (3.5%)      139.90      (2.8%)  -10.3% ( -16% -   -4%) 0.000
            OrNotHighMed      464.05      (4.1%)      416.91      (4.1%)  -10.2% ( -17% -   -2%) 0.000
          HighTermVector     1222.14      (3.5%)     1105.22      (2.2%)   -9.6% ( -14% -   -4%) 0.000
                Wildcard       91.66      (2.2%)       83.55      (2.4%)   -8.9% ( -13% -   -4%) 0.000
         LowSloppyPhrase       46.92      (2.1%)       42.77      (1.5%)   -8.8% ( -12% -   -5%) 0.000
             LowSpanNear       87.56      (2.7%)       80.02      (2.0%)   -8.6% ( -13% -   -3%) 0.000
            OrHighNotMed      470.32      (5.4%)      429.87      (3.3%)   -8.6% ( -16% -    0%) 0.000
                HighTerm     1269.54      (7.1%)     1163.40      (5.8%)   -8.4% ( -19% -    4%) 0.000
           OrHighNotHigh      430.03      (4.1%)      395.13      (4.3%)   -8.1% ( -15% -    0%) 0.000
            OrHighNotLow      585.18      (5.0%)      537.96      (5.5%)   -8.1% ( -17% -    2%) 0.000
             AndHighHigh       37.46      (3.2%)       34.45      (2.6%)   -8.0% ( -13% -   -2%) 0.000
               OrHighLow      576.99      (7.4%)      533.08      (6.4%)   -7.6% ( -19% -    6%) 0.001
              HighPhrase      261.45      (2.1%)      241.99      (1.8%)   -7.4% ( -11% -   -3%) 0.000
               LowPhrase      375.57      (3.3%)      347.95      (3.7%)   -7.4% ( -13% -    0%) 0.000
            HighSpanNear       20.85      (2.8%)       19.36      (2.0%)   -7.2% ( -11% -   -2%) 0.000
               MedPhrase       22.28      (1.8%)       20.77      (1.2%)   -6.8% (  -9% -   -3%) 0.000
         MedSloppyPhrase       15.96      (3.0%)       14.92      (2.5%)   -6.5% ( -11% -    0%) 0.000
                 Prefix3      204.87      (2.0%)      193.03      (2.2%)   -5.8% (  -9% -   -1%) 0.000
           OrNotHighHigh      496.09      (6.6%)      467.47      (4.1%)   -5.8% ( -15% -    5%) 0.001
        HighSloppyPhrase       22.22      (4.0%)       21.02      (3.0%)   -5.4% ( -11% -    1%) 0.000
       HighTermMonthSort      132.66     (11.4%)      126.18     (13.2%)   -4.9% ( -26% -   22%) 0.211
              TermDTSort       68.55     (13.3%)       65.22     (15.1%)   -4.9% ( -29% -   27%) 0.279
               OrHighMed       78.75      (3.8%)       74.95      (2.6%)   -4.8% ( -10% -    1%) 0.000
             MedSpanNear       29.19      (2.3%)       27.83      (1.9%)   -4.7% (  -8% -    0%) 0.000
   HighTermDayOfYearSort      105.52     (12.0%)      101.55     (10.1%)   -3.8% ( -23% -   20%) 0.284
                  IntNRQ       76.00     (12.3%)       73.29     (11.9%)   -3.6% ( -24% -   23%) 0.351
    HighIntervalsOrdered       21.82      (1.5%)       21.05      (1.5%)   -3.5% (  -6% -    0%) 0.000
    HighTermTitleBDVSort      114.39     (16.4%)      110.77     (14.9%)   -3.2% ( -29% -   33%) 0.522
              OrHighHigh       16.17      (3.5%)       15.68      (2.8%)   -3.1% (  -9% -    3%) 0.002
BrowseDayOfYearSSDVFacets        9.57      (9.0%)        9.29      (5.2%)   -2.9% ( -15% -   12%) 0.221
                  Fuzzy2       30.01      (8.5%)       30.26     (10.2%)    0.8% ( -16% -   21%) 0.781

uschindler · 2021-01-15T19:25:32Z

I removed the hacks for the bugs in JDK. The minimum requirement to test this draft implementation is JDK-16-ea-b32.

Thanks @msokolov for the performance tests. I was hoping that it gets a bit better, but we may need to figure out where the speed differences are coming from. I don't think results will change in the new preview build, I just removed the hacks.

uschindler · 2021-06-07T12:14:58Z

I will close this and would like to move discussion over to apache/lucene#173

Initial state of new jdk-foreign MMAP API

190a853

uschindler added enhancement optimization labels Jan 2, 2021

uschindler requested review from dweiss, jpountz and rmuir January 2, 2021 14:42

uschindler self-assigned this Jan 2, 2021

uschindler marked this pull request as draft January 2, 2021 14:43

uschindler requested review from mikemccand and msokolov January 2, 2021 14:47

Workaround to prevent incorrect test files from being executed (copied from ANT build)

00d01a7

Fix the remaining TODOs: make sure we unmap all segments if exceptions occur! Remove useless slicing if aligned.

22c3c4b

uschindler added 4 commits January 3, 2021 17:27

Cleanup code duplication mess exception handling and rename all remaining "buffer" to "segment"; also make the segments array final (curSegment == null when closed)

f9ca335

add missing ensureOpen() as NPE can't happen here

1a8a354

Cleanup messy duplicate methods

27fce4f

Add workaround for JDK-8259028

efcfccc

msokolov reviewed Jan 3, 2021

lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java (outdated)

uschindler force-pushed the draft/jdk-foreign-mmap branch from 570a31b to c56851c Compare January 3, 2021 22:25

mcimadamore reviewed Jan 5, 2021

lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java

uschindler force-pushed the draft/jdk-foreign-mmap branch from 8988de4 to 50d9300 Compare January 6, 2021 23:10

uschindler added 2 commits January 7, 2021 00:18

Merge branch 'master' into draft/jdk-foreign-mmap

8dd5d90

Add readLEFloats() introduced by LUCENE-9652 / apache#2175

0245d3f

uschindler added 2 commits January 7, 2021 00:44

Improve test to allow the following exception: "java.lang.IllegalStateException: Cannot close while another thread is accessing the segment"

ea188c1

Add a new interface to Lucene's core to mark classes which are wrapping objects to extend their functionality (like asserting in tests)

01aca07

Split and rewrite getBytes() and remove useless try-with-resources (heap segments don't need this)

60200e8

uschindler force-pushed the draft/jdk-foreign-mmap branch from 5336b6b to 60200e8 Compare January 7, 2021 01:02

Add static final boolean IS_LITTLE_ENDIAN and cleanup if statements

ba61072

uschindler added 2 commits January 15, 2021 19:49

Merge branch 'master' into draft/jdk-foreign-mmap

ec304ab

Remove hacks: JDK-16 EA b32 has now fixed the horrible bugs with zero length mappings and offsets

5edcdf4

uschindler changed the title ~~Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs~~ Initial rewrite of MMapDirectory for JDK-16 preview (incubating) Panama APIs (>= JDK-16-ea-b32) Jan 15, 2021

uschindler added 2 commits January 16, 2021 00:17

Merge branch 'master' into draft/jdk-foreign-mmap

b0eec7a

Improve close method to also null out the segments, so positional API can correctly throw AlreadyClosedEx; TODO: add a test

7a3cf53

zacharymorn mentioned this pull request Jan 16, 2021

LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory #2052

Merged

7 tasks

Merge branch 'master' into draft/jdk-foreign-mmap

d2c0be5

rmuir mentioned this pull request Mar 15, 2021

LUCENE-9838: simd version of VectorUtil.dotProduct apache/lucene#18

Closed

uschindler mentioned this pull request May 3, 2021

LUCENE-9047: Move the Directory APIs to be little endian (take 2) apache/lucene#107

Merged

uschindler mentioned this pull request Jun 7, 2021

Initial rewrite of MMapDirectory for JDK-16 preview (incubating) Panama APIs (>= JDK-16-ea-b32) apache/lucene#173

Closed

uschindler closed this Jun 7, 2021

uschindler deleted the draft/jdk-foreign-mmap branch June 7, 2021 12:15

asfimport mentioned this pull request Mar 20, 2021

simd version of VectorUtil.dotProduct [LUCENE-9838] apache/lucene#10877

Closed
