@zhiqiang-hhhh
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

zhiqiang-hhhh and others added 22 commits April 1, 2025 15:53
rebase apache#49703 on master
rm diskann src code (impl is kept as reference)
fix BE compile
fix fmt

NOTE: FE compilation still fails.

---------

Co-authored-by: chenlinzhong <490103404@qq.com>
```
CREATE TABLE `vector_table` (
  `siteid` int(11) NULL DEFAULT "10" COMMENT "",
  `embedding` array<float>  NOT NULL  COMMENT "",
  `comment` text NULL,
  INDEX idx_test_ann (`embedding`) USING ANN PROPERTIES(
    "index_type"="hnsw",
    "metric_type"="l2",
    "dim"="8",
    "max_degree"="100") COMMENT 'test diskann index',
  INDEX idx_comment (`comment`) USING INVERTED PROPERTIES(
    "support_phrase" = "true",
    "parser" = "english",
    "lower_case" = "true") COMMENT 'inverted index for comment'
) ENGINE=OLAP
DUPLICATE KEY(`siteid`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`siteid`) BUCKETS 1
PROPERTIES ( "replication_num" = "1" );

INSERT INTO `vector_table` (`siteid`, `embedding`,`comment`) VALUES
(10, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0,20],"emb1"),
(20, [7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0,30],"emb2")
--------------

Query OK, 2 rows affected (0.07 sec)
{'label':'label_858347013b14baf_b9db5d59b5e30322', 'status':'VISIBLE', 'txnId':'18029'}
```
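
For reference when checking query results against this sample data, the L2 distance between the two inserted rows can be worked out by hand. The sketch below is plain Python, not Doris or Faiss code; the helper names `squared_l2` and `l2_distance` are assumptions made for illustration, and the dimension check simply mirrors the `"dim"="8"` property. Some libraries, Faiss included, report the squared L2 value rather than its square root, so both are shown.

```python
import math

DIM = 8  # matches "dim"="8" in the index properties above

# The two embeddings inserted into vector_table.
emb1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0]
emb2 = [7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 30.0]


def squared_l2(a, b):
    """Sum of squared component differences; both vectors must have DIM entries."""
    if len(a) != DIM or len(b) != DIM:
        raise ValueError(f"expected {DIM}-dimensional vectors")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance, i.e. the square root of squared_l2."""
    return math.sqrt(squared_l2(a, b))


print(squared_l2(emb1, emb2))   # 212.0
print(l2_distance(emb1, emb2))
```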

```
I20250401 19:18:17.977408 3765348 faiss_vector_index.cpp:86] Faiss index saved to faiss.idx, rows 2
```
…g. (apache#49780)

This is the first version with relatively stable behavior:

* Pass the TPC-H test without crashes.

* Perform range search without crashes.

* Perform ANN Top-N search without crashes.

* Perform compound search (range + Top-N) without crashes.
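
To make the three search modes in the list above concrete, here is a brute-force sketch in plain Python. It is not the Faiss/HNSW code path this PR exercises; the function names and the `(id, vector)` row layout are assumptions for illustration, and squared L2 distance is used as the metric, so `radius` is a squared-distance threshold.

```python
import heapq


def sq_l2(a, b):
    """Squared L2 distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def range_search(rows, query, radius):
    """Return ids of all rows whose distance to query is <= radius."""
    return [rid for rid, vec in rows if sq_l2(vec, query) <= radius]


def top_n(rows, query, n):
    """Return ids of the n rows closest to query, nearest first."""
    nearest = heapq.nsmallest(n, rows, key=lambda row: sq_l2(row[1], query))
    return [rid for rid, _ in nearest]


def compound_search(rows, query, radius, n):
    """Range filter first, then Top-N among the surviving rows."""
    in_range = [(rid, vec) for rid, vec in rows if sq_l2(vec, query) <= radius]
    return top_n(in_range, query, n)


rows = [
    (10, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0]),
    (20, [7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 30.0]),
]
query = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 21.0]

print(range_search(rows, query, radius=4.0))          # [10]
print(top_n(rows, query, n=2))                        # [10, 20]
print(compound_search(rows, query, radius=4.0, n=1))  # [10]
```

An exact scan like this can serve as a ground truth when comparing against approximate index results on small data sets.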

Some unit tests are unstable; the failures may be related to the Faiss library.

```text
[  FAILED  ] 4 tests, listed below:
[  FAILED  ] VectorSearchTest.AnnTopNDescriptorEvaluateTopN
[  FAILED  ] VectorSearchTest.CompRangeSearch
[  FAILED  ] VectorSearchTest.RangeSearchNoSelector1
[  FAILED  ] VectorSearchTest.RangeSearchWithSelector1
```

The TPC-H test also fails intermittently:
```text
q13	Error: Failed to execute query q13 (cold run). Output:
ERROR 1105 (HY000) at line 18: errCode = 2, detailMessage = (10.16.10.2)[INTERNAL_ERROR]Parameters start = 0, length = 4064, are out of bound in ColumnVector<T>::insert_range_from method (data.size() = 0).
```
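
The message describes a range copy requested from an empty source column. A minimal Python sketch of the kind of bounds check that produces such an error is shown below; the function and its signature are hypothetical and only mirror the parameter names in the message (`start`, `length`, `data.size()`), not the actual `ColumnVector<T>::insert_range_from` implementation.

```python
def insert_range_from(dst, src, start, length):
    """Append src[start:start + length] to dst, rejecting out-of-bound ranges."""
    if start < 0 or length < 0 or start + length > len(src):
        raise IndexError(
            f"Parameters start = {start}, length = {length}, "
            f"are out of bound (data.size() = {len(src)})."
        )
    dst.extend(src[start:start + length])


dst = []
insert_range_from(dst, [1.0, 2.0, 3.0], start=1, length=2)
print(dst)  # [2.0, 3.0]

try:
    # Same shape as the failing call: 4064 rows requested from an empty source.
    insert_range_from(dst, [], start=0, length=4064)
except IndexError as exc:
    print(exc)
```

A check of this shape firing with `data.size() = 0` suggests the caller expected 4064 rows in a column that was never filled, which may help narrow down where to look.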
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?
