Upgrade from JDK17 to JDK19+ #47

Closed
slow-J opened this issue Aug 16, 2023 · 10 comments · Fixed by #56

@slow-J
Collaborator

slow-J commented Aug 16, 2023

As part of having all the components updated to the newest versions, we should do the same for the JDK.

Currently only JDK17 is supported. To upgrade to JDK 19 we need to enable the Panama API and build MemorySegmentIndexInputProvider into the JAR.

This is the current error message when running make index with JDK19:

--- Indexing Lucene 9.5.0 with %2 deletes ---
java -server -cp build/libs/search-index-benchmark-game-lucene-1.0-SNAPSHOT-all.jar BuildIndex idx 2 < /local/home/jslowins/unison/tantivy-bench/search-benchmark-game/corpus/enwiki-20120502-lines-1k-fixed-utf8-with-random-label.txt
Exception in thread "main" java.lang.LinkageError: MemorySegmentIndexInputProvider is missing in Lucene JAR file
        at org.apache.lucene.store.MMapDirectory.lookupProvider(MMapDirectory.java:437)
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:318)
        at org.apache.lucene.store.MMapDirectory.doPrivileged(MMapDirectory.java:395)
        at org.apache.lucene.store.MMapDirectory.<clinit>(MMapDirectory.java:448)
        at org.apache.lucene.store.FSDirectory.open(FSDirectory.java:161)
        at org.apache.lucene.store.FSDirectory.open(FSDirectory.java:156)
        at BuildIndex.main(BuildIndex.java:33)
Caused by: java.lang.ClassNotFoundException: org.apache.lucene.store.MemorySegmentIndexInputProvider
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
        at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:495)
        at java.base/java.lang.Class.forName(Class.java:474)
        at java.base/java.lang.invoke.MethodHandles$Lookup.findClass(MethodHandles.java:2786)
        at org.apache.lucene.store.MMapDirectory.lookupProvider(MMapDirectory.java:422)
        ... 6 more
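
The trace shows the provider class being looked up by name through MethodHandles.Lookup.findClass, with the ClassNotFoundException surfacing as a LinkageError. As a diagnostic aid, here is a minimal sketch of that lookup pattern. It is not Lucene's code; the class name ProviderLookupSketch and the helper findOptionalClass are made up for illustration.

```java
import java.lang.invoke.MethodHandles;

public class ProviderLookupSketch {
    // Hypothetical helper: tries to load a class by name and returns null when it is absent.
    static Class<?> findOptionalClass(String className) {
        try {
            return MethodHandles.lookup().findClass(className);
        } catch (ClassNotFoundException e) {
            return null;
        } catch (IllegalAccessException e) {
            throw new LinkageError("Class is present but not accessible: " + className, e);
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace above.
        String name = "org.apache.lucene.store.MemorySegmentIndexInputProvider";
        Class<?> provider = findOptionalClass(name);
        int feature = Runtime.version().feature();
        if (provider == null) {
            System.out.println(name + " is NOT visible on JDK " + feature);
        } else {
            System.out.println(name + " is visible on JDK " + feature);
        }
    }
}
```

Running it with the same -cp value as the failing command should tell whether the class is visible to that JVM at all, independent of the indexing code.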
@slow-J
Collaborator Author

slow-J commented Aug 17, 2023

Either I have weak Gradle knowledge or this is not trivial. For reference, here is how the Panama Vector API was integrated into Lucene: https://github.com/apache/lucene/pull/12311/files

@slow-J
Collaborator Author

slow-J commented Sep 7, 2023

I still have not figured out the vector API. We can disable this for the moment with -Dorg.apache.lucene.store.MMapDirectory.enableMemorySegments=false; I will first run a benchmark to confirm the results are sane.
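
Since -D is a JVM option, it has to appear on the java command line before the main class (BuildIndex here) for the property to be set. The following is a small sketch for confirming that the flag actually reached the JVM. The class MemorySegmentsFlagSketch is hypothetical, and treating an unset property as "enabled" is my assumption, not a statement about how Lucene parses the value.

```java
public class MemorySegmentsFlagSketch {
    // Property name as used in the comment above.
    static final String PROP = "org.apache.lucene.store.MMapDirectory.enableMemorySegments";

    // Assumption: an unset property means "enabled"; anything other than "true" disables it.
    static boolean memorySegmentsEnabled(String rawValue) {
        return rawValue == null || Boolean.parseBoolean(rawValue);
    }

    public static void main(String[] args) {
        String raw = System.getProperty(PROP);
        System.out.println("JDK feature version: " + Runtime.version().feature());
        System.out.println(PROP + " = " + raw);
        System.out.println("Interpreted as enabled: " + memorySegmentsEnabled(raw));
    }
}
```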

@slow-J slow-J self-assigned this Sep 7, 2023
@slow-J slow-J changed the title from "Upgrade from JDK17 to JDK19" to "Upgrade from JDK17 to JDK19+" Sep 7, 2023
@mikemccand
Collaborator

Hmm, the vector API uses a multi-release JAR somehow. Lucene has specific sources that are switched in depending on the JVM version, but all versions (JDK 19, 20) should have been compiled into the Lucene core JAR. And it should have enabled the Panama-based MMapDirectory by default on JDK 19. Ideally we run this test using Panama to ensure we are indeed testing the latest & greatest Lucene features?

Have we upgraded this benchmark to Lucene 9.7.0?
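
For readers unfamiliar with multi-release JARs: when a JAR is treated as multi-release, the runtime prefers the class under the highest META-INF/versions/N directory whose N does not exceed its own feature version, and falls back to the base entry otherwise. Here is a minimal sketch of just that selection rule. VersionedEntrySketch and selectVersion are illustrative names, and any extra version checks Lucene may perform on top of this are not modeled.

```java
import java.util.List;
import java.util.OptionalInt;

public class VersionedEntrySketch {
    // Returns the highest available version N with N <= runtimeFeature,
    // or empty when only the base (unversioned) entry applies.
    static OptionalInt selectVersion(List<Integer> available, int runtimeFeature) {
        return available.stream()
                .mapToInt(Integer::intValue)
                .filter(v -> v <= runtimeFeature)
                .max();
    }

    public static void main(String[] args) {
        // Version directories assumed present for illustration.
        List<Integer> available = List.of(19, 20, 21);
        int feature = Runtime.version().feature();
        OptionalInt selected = selectVersion(available, feature);
        System.out.println(selected.isPresent()
                ? "JDK " + feature + " would prefer META-INF/versions/" + selected.getAsInt()
                : "JDK " + feature + " would use the base entries only");
    }
}
```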

@slow-J
Collaborator Author

slow-J commented Sep 7, 2023

Yes, the benchmark has been upgraded to Lucene 9.7.0 and is running it.

I tried adding --enable-preview --add-modules jdk.incubator.vector to the java Makefile, which didn't work.
I now get:
WARNING: Using incubator modules: jdk.incubator.vector
Exception in thread "main" java.lang.LinkageError: MemorySegmentIndexInputProvider is missing in Lucene JAR file

Even with enableMemorySegments=false, though, JDK20 changes latency by -2.26% for COUNT and by -0.89% for TOP_10_COUNT, an improvement in both cases. Maybe it is worth disabling for now?
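
To make the sign convention explicit, here is a tiny sketch of how such a figure can be computed, assuming the percentages are relative latency changes against the JDK17 baseline. The class name and the latency values are hypothetical and are not measurements from this benchmark.

```java
public class LatencyChangeSketch {
    // Relative change of the candidate versus the baseline, in percent.
    // A negative result means the candidate has lower latency.
    static double percentChange(double baselineMicros, double candidateMicros) {
        return (candidateMicros - baselineMicros) / baselineMicros * 100.0;
    }

    public static void main(String[] args) {
        // Hypothetical latencies, NOT taken from the benchmark run.
        double baselineMicros = 1000.0;
        double candidateMicros = 980.0;
        System.out.printf("%+.2f%%%n", percentChange(baselineMicros, candidateMicros));
    }
}
```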


@mikemccand
Collaborator

> Maybe it is worth disabling for now?

+1 -- let's move forward with this upgrade and open a follow-on issue to get Panama MMAP working again? Crazy it's so hard ... the JDK-version-specific implementations should already be in the Lucene core JAR, under META-INF/versions/N/.... E.g. this is in my lucene-core-9.8.0-snapshot.jar (built on the current tip of the Lucene 9x branch):

  drwxr-xr-x       0  10-Jul-2023 10:19:58  META-INF/services/
  -rw-r--r--     855  10-Jul-2023 10:19:58  META-INF/services/org.apache.lucene.analysis.TokenizerFactory                                                                                    
  -rw-r--r--     843  10-Jul-2023 10:19:58  META-INF/services/org.apache.lucene.codecs.Codec                                                                                                 
  -rw-r--r--     853  10-Jul-2023 10:19:58  META-INF/services/org.apache.lucene.codecs.DocValuesFormat                                                                                       
  -rw-r--r--     855  10-Jul-2023 10:19:58  META-INF/services/org.apache.lucene.codecs.KnnVectorsFormat                                                                                      
  -rw-r--r--     852  10-Jul-2023 10:19:58  META-INF/services/org.apache.lucene.codecs.PostingsFormat                                                                                        
  -rw-r--r--     939  10-Jul-2023 10:19:58  META-INF/services/org.apache.lucene.index.SortFieldProvider                                                                                      
  drwxr-xr-x       0  12-Sep-2023 05:59:14  META-INF/versions/                                                                                                                               
  drwxr-xr-x       0  12-Sep-2023 05:59:14  META-INF/versions/19/                                                                                                                            
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/19/org/                                                                                                                        
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/19/org/apache/                                                                                                                 
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/19/org/apache/lucene/                                                                                                          
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/19/org/apache/lucene/store/                                                                                                    
  -rw-r--r--    3356  10-Jul-2023 10:20:04  META-INF/versions/19/org/apache/lucene/store/MemorySegmentIndexInput$MultiSegmentImpl.class                                                      
  -rw-r--r--    4098  10-Jul-2023 10:20:04  META-INF/versions/19/org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl.class                                                     
  -rw-r--r--   13423  10-Jul-2023 10:20:04  META-INF/versions/19/org/apache/lucene/store/MemorySegmentIndexInput.class                                                                       
  -rw-r--r--    5219  10-Jul-2023 10:20:04  META-INF/versions/19/org/apache/lucene/store/MemorySegmentIndexInputProvider.class                                                               
  drwxr-xr-x       0  12-Sep-2023 05:59:14  META-INF/versions/20/                                                                                                                            
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/20/org/                                                                                                                        
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/20/org/apache/                                                                                                                 
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/20/org/apache/lucene/                                                                                                          
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/20/org/apache/lucene/internal/                                                                                                 
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/20/org/apache/lucene/internal/vectorization/                                                                                   
  -rw-r--r--   11350  10-Jul-2023 10:20:04  META-INF/versions/20/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.class                                                      
  -rw-r--r--    3949  10-Jul-2023 10:20:04  META-INF/versions/20/org/apache/lucene/internal/vectorization/PanamaVectorizationProvider.class                                                  
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/20/org/apache/lucene/store/                                                                                                    
  -rw-r--r--    3322  10-Jul-2023 10:20:04  META-INF/versions/20/org/apache/lucene/store/MemorySegmentIndexInput$MultiSegmentImpl.class                                                      
  -rw-r--r--    4072  10-Jul-2023 10:20:04  META-INF/versions/20/org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl.class                                                     
  -rw-r--r--   13197  10-Jul-2023 10:20:04  META-INF/versions/20/org/apache/lucene/store/MemorySegmentIndexInput.class                                                                       
  -rw-r--r--    5236  10-Jul-2023 10:20:04  META-INF/versions/20/org/apache/lucene/store/MemorySegmentIndexInputProvider.class                                                               
  drwxr-xr-x       0  12-Sep-2023 05:59:14  META-INF/versions/21/                                                                                                                            
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/21/org/                                                                                                                        
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/21/org/apache/                                                                                                                 
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/21/org/apache/lucene/                                                                                                          
  drwxr-xr-x       0  10-Jul-2023 10:20:04  META-INF/versions/21/org/apache/lucene/store/                                                                                                    
  -rw-r--r--    3322  10-Jul-2023 10:20:04  META-INF/versions/21/org/apache/lucene/store/MemorySegmentIndexInput$MultiSegmentImpl.class                                                      
  -rw-r--r--    4072  10-Jul-2023 10:20:04  META-INF/versions/21/org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl.class                                                     
  -rw-r--r--   13262  10-Jul-2023 10:20:04  META-INF/versions/21/org/apache/lucene/store/MemorySegmentIndexInput.class                                                                       
  -rw-r--r--    5167  10-Jul-2023 10:20:04  META-INF/versions/21/org/apache/lucene/store/MemorySegmentIndexInputProvider.class                                                               

@slow-J
Collaborator Author

slow-J commented Sep 19, 2023

The output of jar tf search-index-benchmark-game-lucene-1.0-SNAPSHOT-all.jar does contain the versioned classes:

META-INF/versions/
META-INF/versions/19/
META-INF/versions/19/org/
META-INF/versions/19/org/apache/
META-INF/versions/19/org/apache/lucene/
META-INF/versions/19/org/apache/lucene/store/
META-INF/versions/19/org/apache/lucene/store/MemorySegmentIndexInput$MultiSegmentImpl.class
META-INF/versions/19/org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl.class
META-INF/versions/19/org/apache/lucene/store/MemorySegmentIndexInput.class
META-INF/versions/19/org/apache/lucene/store/MemorySegmentIndexInputProvider.class
META-INF/versions/20/
META-INF/versions/20/org/
META-INF/versions/20/org/apache/
META-INF/versions/20/org/apache/lucene/
META-INF/versions/20/org/apache/lucene/store/
META-INF/versions/20/org/apache/lucene/store/MemorySegmentIndexInput$MultiSegmentImpl.class
META-INF/versions/20/org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl.class
META-INF/versions/20/org/apache/lucene/store/MemorySegmentIndexInput.class
META-INF/versions/20/org/apache/lucene/store/MemorySegmentIndexInputProvider.class
META-INF/versions/20/org/apache/lucene/util/
META-INF/versions/20/org/apache/lucene/util/VectorUtilPanamaProvider.class
META-INF/versions/21/
META-INF/versions/21/org/
META-INF/versions/21/org/apache/
META-INF/versions/21/org/apache/lucene/
META-INF/versions/21/org/apache/lucene/store/
META-INF/versions/21/org/apache/lucene/store/MemorySegmentIndexInput$MultiSegmentImpl.class
META-INF/versions/21/org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl.class
META-INF/versions/21/org/apache/lucene/store/MemorySegmentIndexInput.class
META-INF/versions/21/org/apache/lucene/store/MemorySegmentIndexInputProvider.class
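
One thing a jar tf listing does not show is the manifest content. The JVM only honors META-INF/versions/N entries when the JAR's main manifest section declares Multi-Release: true; otherwise those entries are ignored and only the base classes are visible. Whether that is what happens with this -all.jar is an open question, not something established in this thread. A minimal sketch for checking it, with MultiReleaseCheckSketch and declaresMultiRelease as made-up names:

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class MultiReleaseCheckSketch {
    // True only when the main manifest section declares "Multi-Release: true".
    static boolean declaresMultiRelease(Manifest manifest) {
        if (manifest == null) {
            return false; // a JAR without a manifest cannot be multi-release
        }
        String value = manifest.getMainAttributes().getValue(Attributes.Name.MULTI_RELEASE);
        return value != null && "true".equalsIgnoreCase(value.trim());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java MultiReleaseCheckSketch <path-to-jar>");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            System.out.println("Multi-Release declared: " + declaresMultiRelease(jar.getManifest()));
        }
    }
}
```

Pointing it at both the Lucene core JAR and the benchmark's -all.jar would show whether the attribute survives the packaging step.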

I'll dig around some more. For now, I'll temporarily disable enableMemorySegments to allow building with JDK19+.

@slow-J
Collaborator Author

slow-J commented Sep 19, 2023

#56

@mikemccand
Collaborator

> I'll dig around some more, for now I'll temporarily disable enableMemorySegments to allow building with JDK19+.

Do we have an issue open to re-enable Panama MMAP in Lucene? I.e. to get to the bottom of the build / CLASSPATH mrjar issues.

@slow-J
Collaborator Author

slow-J commented Nov 6, 2023

I'll create an issue now so that we don't forget about this :D

@slow-J
Collaborator Author

slow-J commented Nov 6, 2023

#59
