
[vector search] Step forward on stability and functionality #51213


Merged on May 27, 2025 (3 commits)

Conversation

zhiqiang-hhhh
Contributor

@zhiqiang-hhhh zhiqiang-hhhh commented May 24, 2025

A huge step forward on stability and functionality.

Functionality

  1. Search parameters such as ef_search can be passed to the index as session variables. This matches the behavior of pgvector and DuckDB's vector search extension.
  2. ORDER BY ... DESC is now processed correctly, falling back to brute-force search when necessary.
  3. Inner product is supported as an index metric, together with ordering by inner_product.
  4. When the metric used in the SQL does not match the index metric, the query falls back to brute-force search.
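The fallback rules in items 2 to 4 can be pictured with a small Python sketch. It is not the Doris implementation: the names `Metric`, `can_use_ann_index`, and `plan_search` are hypothetical, and it assumes an ANN index only returns the closest rows (smallest L2 distance or largest inner product), so any other ordering needs a brute-force scan.

```python
from enum import Enum


class Metric(Enum):
    """Hypothetical metric names used only for this illustration."""
    L2_DISTANCE = "l2_distance"
    INNER_PRODUCT = "inner_product"


def can_use_ann_index(index_metric, query_metric, descending):
    """Return True when a top-n query could be served by the ANN index.

    Assumption (not stated in the PR): the index yields only the closest
    rows, i.e. smallest L2 distance or largest inner product.
    """
    if query_metric != index_metric:
        # The metric in the SQL does not match the index metric.
        return False
    if query_metric is Metric.L2_DISTANCE:
        # Closest rows come first under ascending distance.
        return not descending
    # Closest rows come first under descending inner product.
    return descending


def plan_search(index_metric, query_metric, descending):
    """Pick a search strategy: 'ann_index' or 'brute_force'."""
    if can_use_ann_index(index_metric, query_metric, descending):
        return "ann_index"
    return "brute_force"
```

Under these assumptions, `plan_search(Metric.L2_DISTANCE, Metric.L2_DISTANCE, descending=True)` picks brute force, because the farthest rows are not something the index can return directly.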

Stability

  1. More unit tests.
  2. Virtual column iterator.
  3. According to a custom script, the results of range search, top-n search, and compound search are almost the same as those of native faiss. The result overlap rate is above 90%; the remaining difference of up to 10% is introduced by the batch insert mode of native faiss.
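The custom script is not part of this page, so the following is only a guess at how an overlap rate like the one in item 3 might be computed. The function name `overlap_rate` and the choice of the reference result size as the denominator are assumptions.

```python
def overlap_rate(result_ids, reference_ids):
    """Fraction of reference row ids that also appear in the result.

    Assumption: both arguments are iterables of row ids returned by the
    same query, one from the system under test and one from a reference
    such as native faiss.
    """
    reference = set(reference_ids)
    if not reference:
        # Nothing to compare against; treat as full agreement.
        return 1.0
    return len(set(result_ids) & reference) / len(reference)
```

For example, if nine of ten reference ids also appear in the tested result, the function returns 0.9.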

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@zhiqiang-hhhh zhiqiang-hhhh changed the title from "Stable 1.0" to "[vector search] Step forward on stability and functionality" May 24, 2025
@zhiqiang-hhhh zhiqiang-hhhh marked this pull request as ready for review May 24, 2025 11:22
@yiguolei yiguolei merged commit 1b671c9 into apache:vector-index-dev May 27, 2025
3 checks passed
@zhiqiang-hhhh zhiqiang-hhhh deleted the stable-1.1 branch May 27, 2025 06:31
zhiqiang-hhhh added a commit to zhiqiang-hhhh/doris that referenced this pull request Jun 9, 2025
zhiqiang-hhhh added a commit to zhiqiang-hhhh/doris that referenced this pull request Jun 12, 2025