Future of Elastiknn #349

Closed
alexklibisz opened this issue Mar 14, 2022 · 5 comments

Comments

@alexklibisz
Owner

alexklibisz commented Mar 14, 2022

My current plan is to begin winding down my contributions and support of Elastiknn. I envision the project continuing in "maintenance mode", and I want to give any users a specific heads-up about this. In many ways, this is how I've approached the project for the past year or so. I'm just spelling it out so that expectations are clear.

The project faces several headwinds, not all bad.

Good headwinds:

  1. Elasticsearch and OpenSearch are both investing in ANN implementations. After over a year of development, Lucene and Elasticsearch are finally starting to expose an API for ANN search. There's also a growing ecosystem of ANN search solutions, both open source and proprietary. It's great to see the ecosystem growing.
  2. I have other technical interests I'd like to pursue. Elastiknn has been a great platform for learning and experimenting, but I actually haven't implemented any sort of ANN search professionally since 2017, back when the idea for Elastiknn was born. I'm glad I followed through on implementing this, but, all things considered, my time at this point is better spent on other problems.

Other headwinds:

  1. The project has seen almost no outside contribution. There have been many issues and emails in which I respond, "I don't have time to work on this right this moment, but there's a developer-guide.md and I'm happy to review PRs." I can count on one hand the number of times someone has followed up on this. To be very clear, I have had some help doing maintenance, like upgrading ES versions, for which I'm very appreciative.
  2. It's unclear if anyone is even using this in any consequential way. I see some downloads, but only one person has submitted to the list of users in the readme. For better or worse, I get a lot of satisfaction from knowing my effort translates to solved problems. So the lack of feedback makes it hard to find motivation for this effort compared to some other efforts.

There are a few final tasks I'd like to complete to satisfy my own curiosities:

  1. Upgrade to Elasticsearch 8.x (Upgrade to Elasticsearch 8.x #348). I'll continue reviewing upgrade PRs if other folks make them, but this will be the last major upgrade which I personally do.
  2. Complete and merge a benchmark implementation based on some of the big-ann-benchmarks datasets (Lucene benchmarks for Big-ANN challenge #278). Ideally this would also include an apples/apples comparison to Lucene's HNSW implementation. I'm curious how the numbers play out.
  3. Get rid of the Unsafe vector serialization (Remove Unsafe module for vector serialization #263).

Concretely, once the above are done, here's how I see the future of Elastiknn playing out in "maintenance mode":

  1. I'll continue to review and merge version bump PRs. Based on Elastic's historical cadence, I think getting onto 8.x will lead to a relatively steady state for about a year or so.
  2. If someone reports a bug, it'll come through my email and I might comment on how it might be fixed. If it's particularly interesting, and scratches an intellectual itch, I might look into it myself.
  3. I won't pursue big feature additions like Feature Request: Support Multiple Vectors per Field #197, Support range queries (neighbors within some distance) #279, Function score query omits relevant results on large dataset #298, Support for elastiknn_dense_float_vector in script_score #323. If someone submits a PR, I'll give it a look. I'll keep a high standard for the project and won't merge low-quality PRs. It would take some impressive ambition to solve these kinds of problems well.
@alexklibisz alexklibisz pinned this issue Mar 14, 2022
@bennimmo
Contributor

Really sad to hear this; we're just about to go live with a large implementation of this plugin in our ES cluster. I understand and appreciate the negative "headwinds" you noted. Just a note on why I believe your plugin is winning: it allows filtering before vector comparison, and its better implementation allows aggregations based on a similarity search. Neither of these features is possible on ES8, and OpenSearch's implementation is flawed, as you have documented yourself.

Is this something you are committed to sunsetting?

@bennimmo
Contributor

Also, to note, I really appreciate your final commitments to a "few final tasks" 👍

@alexklibisz
Owner Author

Hey @bennimmo, thanks for the kind words, and I'm glad the plugin is working well for you. Please consider adding your company to the users list on the readme.

I updated my original post with a more concrete description of how I see the future of the project looking.

On the filtering and aggregations, I think the Elasticsearch and OpenSearch developers could probably just copy my implementation. It uses existing primitives that have been around for over two years. It's in Scala, but easily translatable to Java (with a bit more boilerplate). Hopefully the folks at Elastic and Amazon have some grander vision, but there's also no shame in copying this stuff.

@iandanforth-alation

@alexklibisz Alation, Inc. is also in the process of rolling out Elastiknn. I appreciate all your hard work on this plugin and especially your willingness to be open and clear with your plans. Since we haven't gone live with the related feature, I don't want to submit a user's PR, but I wanted to make sure you know our appreciation!

@bennimmo
Contributor

I've just created a pull request adding us as a user. I thought I'd done this already, so sorry for the delay. We have actually just gone live with this feature.

@alexklibisz alexklibisz unpinned this issue Jul 17, 2022