-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vector API integration, plan B #12302
Comments
it goes without saying, but i think there are a couple obvious hotspots where we'd want to integrate. I realize containing the complexity may be difficult:
|
Hi @rmuir, About the compilation:
|
To be clear this is what I propose, not making things pluggable. That's not related and not what I want. I want it to just work, similar to mmap. Also we could use the approach for the postings and possibly docvalues decode. |
+1 to this - I would be grateful to you all if you are able to get this working. Don't really want to have to die with a sword in my hand to get to the promised land of vector API. |
I am at a conference this week but i will try to make a prototype, building upon work uwe has done. for example using the apijar technique to prevent having to juggle N jvms on every build/developer machine. This means we can maintain I realize if we do this, it will make the gradle uglier and less elegant. I think theres no way around it, we are re-implementing mr-jar basically. we just have to try to contain the complexity as much as we can... but I really feel this is just something we need to do on behalf of our users. |
Actually when I did this i made this very self contained. It is just a loop over java versions and a modified "jar" task coping gradle outputs to a different prefix dir in the JAR and setting the manifest bit. When we add vectors into the game we may need a bit more separation with build dirs / source sets. We currently have "main19", "main20", "main21", but maybe the names should in future include the type (panama, vector), so we can compile separately. I am ready to help you, just share when you have a branch. |
This does not look too complex: https://github.com/apache/lucene/blob/main/gradle/java/memorysegment-mrjar.gradle |
I agree with the high-level idea - use the Incubating Vector API internally in Lucene, in a way that is opt-in and also constrained by module/private-SPI/loading techniques (so as to reduce the static usage to only a small defined set of code). We would very much like to use this in Elasticsearch.
Same. Related, but not directly proposed. Unfortunately (or not), the incubating Vector API must be loaded as part of the boot layer - it must be added by the command line |
Correction. The incubating Vector API must be loaded as part of the boot layer - it must be added by the command line --add-modules flag OR explicitly required by the consuming module. Loading into a separate loader is not possible because of the qualified exports from java.base. |
I like this too, especially the part that it basically auto-wires based on whether you enable the corresponding jvm feature on cmd line. |
😆
YES!!! This is timely and the heart of my comment on the proposal to increase KNN dimensionality (is it time we explore a new optional Lucene Vector module that supports cutting edge JDK features through gradle tooling for optimizing the vector use case?).
💯 agree @uschindler low friction to enable preview is exactly the mechanism I was thinking about in my post. I'm not a gradle expert (ping @markrmiller ) but I think gradle's java toolchain allows for this type of scenario? We did it for enabling preview features on OpenSearch, as did Elasticsearch so why can't we use this path in the upstream lucene project? Thank you for raising this proposal @rmuir
One quick question, though. What does this look like in practice? Do we create separate classes in |
My idea was just about compiling. As said before you have to enable it via command line. |
The compilation problem is already solved. No need for toolchains. We tried that already, it's a mess especially because it does not support ea releases. Read the code referred above for memory mapping. Works identical for vectors and incubator. We compile against stubs. |
Is there room for a helping hand here? Tests, benchmarks, continuous integration, documentation, anything ? |
yes, as nothing has been done yet :) I personally plan to start a branch if nobody beats me to it, but i'm stuck in a conference and without bandwidth this week. You can get an idea of what it takes to add support for every JDK release here using Uwe's mmap setup: #12294 It is for a preview api, but i think most of the logic applies. maybe the only difference here being at "read time" where we may have to load the class differently and use methodhandle guard (like the crappy prototype i posted in the description). But we can maintain .java files and not use base64! :) There are also some relevant special jenkins build jobs and stuff and yeah, we should be testing against EA releases if possible to stay current, just like Uwe does for mmap. |
To better understand the existing "non-final JDK API stub" mechanism, I quickly put together the small set of changes that we need to get started - that generates the Vector API stubs, #12311. We can use this, or not, but it's just a start. We'll need same for JDK 21 EA, but that is trivial, and needed for Foreign too. @uschindler The mechanism looks good. As you said, it's compile-time only. If we felt the need, we could choose to move the implementations (in org.apache.lucene.store) to a non-exported package, but that is not strictly needed - I'm just pondering if any runtime restrictions would be beneficial. |
Next I might look at refactoring the internals of VectorUtil, and have a different implementation for JDK 20, using a similar guard mechanism as is done for mmap. |
I updated #12311, with a basic |
BTW, and not even remotely suggested - I'm not trying to take over this work, just sketch out a few concrete things to help get it started. Whatever collaboration mechanism is typically used when developing such changes for Lucene should also be used here. How should we best work together on this? where will the code/branch live? |
thank you for getting it started: it is great. Lets just iterate forwards with your branch? personally ive never had an issue collaborating on a fork like this (such as ChrisHegarty:panama_vector). In the worst case anyone can send PRs to you to make changes. I think if you are a committer to lucene you can probably push directly to the branch without PR. you can always create a |
By default all Lucene committers can commit to your branch. This is enabled by default and you had to agree when creating the PR. |
Ok, cool. Committers please commit directly. Consider the branch in my personal fork as our shared place for collaboration. Let me know if you encounter any issues. |
I'm not a commiter. Should I fork your fork and do PRs on it? And is this the branch we should base my work on? I also think that PRs are good for collaboration. But if someone can maintain a small, workable bounty list somewhere, I think that'd be great! Maybe we can do a voice call to gather and break down all the ideas ? Or you can submit your ideas here and I can try to maintain the bounty list myself in a Github Kanban/Project on my own fork....
So, it seems like official support for preview/incubating/experimental/jvm-sensitive features will go back to two LTSes ago? Therefore, when Java 21 is released, you will maintain experimental extensions available since Java 17. Is this correct? If not, when are you planning to drop support for 19 and 20 ? And what will be the policy in the future? |
I don't think there is any official plan or anything here. Sorry I don't really have any answers. I dont think there is any rule on which jdk versions are supported. For mmap, the ones supported are the ones that @uschindler supports, he is shouldering all of the maintenance work. Personally I think, for this one as a start, it is an incubating module and so it is acceptable to only support the very latest jdk release... If someone wants to use the bleeding edge then they must use the bleeding edge. Let's start minimal and try to reduce our maintenance costs. |
It is the correct branch. I would just send @ChrisHegarty a pull request if u have contributions. I am a committee but Might do the same thing myself if I have some time to hack on it soon. Pull requests are nice since we can review each other's changes. |
Sorry for the chaotic typos/answers, on a phone. |
From what I can see, the classes/files modified so far are: Would it make sense for me to port and maintain these changes to either/or Java 17, Java 18, Java 19 .. and Java 21 based on @ChrisHegarty's original Java 20 code? Also, should I start a panama-specific CI stack on my Github account with unit tests and benchmarks (I could write these too). Whatever you guys think is the priority... |
I don't understand what's your intention is. The general setup is there. No infrastructure work needed anymore. The decision which variants of the code life in parallel is unrelated to this PR. Please keep in mind that not all code of Lucene can/may use those APIs at the moment, because vector specific apis can't be part of Lucene public API. So the pattern for other places would be similar to this one. Implement all algorithms with a simple API in the provider and call it from VectorUtil. More complex integrations like postings lists may need more thoughts. Some of them may only be embedded in Lucene 11 (like directly multiplying vectors residing in off-heap memory segments need API changes using not yet public apis). The amount of work for keeping support for various java versions depends on the number of implementations in the provider and how complex the glue code to lucene is. The current vector dot product is so simple, you could duplicate the code to support various versions easily within minutes. But for now let's start with java 20. The vector code in java 17 is outdated and has several performance traps. Memorymapping using segments was added in 19 to Lucene. So don't let's go before that point in time. As Robert said: in contrast to mmap which is a preview API and FULLY supported by Lucene, the incubator code requires users to add command line flags and the apis are still in flux, I would only keep rather recent versions. We do not need to make any guarantee to which versions are supported. If a specific version falls out of our support matrix, the code will fall back to the implementation that's already there. You can be sure that we will make no official guarantees. It may change with every Lucene release. It will not break for anybody, just get slower, so no support guarantees are needed. We are an open source project, not a commercial project! |
++ to all of what @uschindler and @rmuir said relating to which JDK versions we add support for. Specifically, let's start with JDK 20 only. After which we can prepare for, and test with, JDK 21 EA. |
Guys, relax. My intention is to HELP. I was trying to find something that can be done in parallel to what @ChrisHegarty is doing so I don't step on his toes. If you want help, let me know. And be specific about it. |
For what it's worth, I did an alternative impl of vector utils using nd4j to compute cosine similarity... And shockingly, it was not any faster. I am very much in favor of having interfaces here so we can experiment with native implementations... |
Description
For years we have explored using the vector api to actually take advantage of SIMD units on the hardware. A couple of approaches have been attempted so far:
Currently, unless the user is extremely technical, they can't make use of the vector support their hardware has, which is terribly sad. If they have immense resources/funding/etc, they can fork lucene and patch the source code, and maintain a fork, hooking in incubator openjdk stuff, but that's too hard on users.
I think we have to draw a line in the sand, basically we can not rely upon openjdk to be managed as a performant project, their decisions make no sense, we have to play a little less nicer and apply some hacks! otherwise give up and switch to a different programming language with better perf!
So I'd suggest to look at the work @uschindler has done with mmap and the preview apis, and let's carve out a path where we use the vector api IFF the user opts in via the command-line.
Proposal (depends entirely upon user's jdk version and supported flags):
Actually the system @uschindler developed I think is the correct design for this, the only trick is that the incubating api is more difficult than the preview api. So we need more build system support, it could require more stuff to be downloaded or build to be slower. But I think its the right decision?
We don't want to have base64-encoded hackiness that is hard to maintain, at the same time, we need to give the users option to opt-in to actually making use their hardware. I think we should suffer the complexity to make this easy on them. It fucking sucks that openjdk makes this almost impossible, but we need to do it for our users. That's what being a library is all about.
The text was updated successfully, but these errors were encountered: