[Bug]: Vector query returns different results after the IVF index is deleted and recreated #2567

@yelongfei908

Description

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Version or Commit ID

v0.5.2

Other environment information

Actual behavior and How to reproduce it

  1. Create a table
    curl --location --request POST 'http://localhost:23820/databases/testdb/tables/test1' \
    --header 'Content-Type: application/json' \
    --data-raw '{
      "createOption": "ignore_if_exists",
      "fields": [
        {
          "name": "pk",
          "type": "INT64"
        },
        {
          "name": "vector",
          "type": "VECTOR,1024,FLOAT"
        },
        {
          "name": "title",
          "type": "VARCHAR"
        },
        {
          "name": "content",
          "type": "VARCHAR"
        },
        {
          "name": "doc_id",
          "type": "INT64"
        },
        {
          "name": "status",
          "type": "INT8"
        },
        {
          "name": "dataset",
          "type": "VARCHAR"
        }
      ]
    }'

  2. Create the index
    curl --location --request POST 'http://localhost:23820/databases/testdb/tables/test1/indexes/vector-idx' \
    --header 'Accept: application/json' \
    --header 'Content-Type: application/json' \
    --data-raw '{
      "fields": [
        "vector"
      ],
      "index": {
        "type": "ivf",
        "metric": "cosine",
        "storage_type": "scalar_quantization",
        "scalar_quantization_bits": "8"
      },
      "create_option": "ignore_if_exists"
    }'

  3. Insert some data

  4. Query the data
    {
      "output": [
        "pk",
        "_similarity"
      ],
      "search": [
        {
          "match_method": "dense",
          "metric_type": "cosine",
          "element_type": "float",
          "fields": "vector",
          "topn": 20,
          "query_vector": [...]
        }
      ]
    }

  5. Save the results to test1.txt

  6. Drop the index
    curl --location --request DELETE 'http://localhost:23820/databases/testdb/tables/test1/indexes/vector-idx' \
    --header 'Accept: application/json' \
    --header 'Content-Type: application/json; utf-8' \
    --data-raw '{
      "drop_option": "ignore_if_not_exists"
    }'

  7. Query the data using the same command as in step 4

  8. Save the results to test2.txt

  9. Recreate the index using the same command as in step 2

  10. Query the data using the same command as in step 4

  11. Save the results to test3.txt

  12. Compare test1.txt, test2.txt, and test3.txt: they differ in the returned items, their order, and their scores
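
The comparison in step 12 can be scripted. The following is only a minimal sketch: it assumes each saved file holds a JSON array of row objects with `pk` and `_similarity` keys, which may not match the actual response layout, and the helper names `load_hits` and `compare_hits` are hypothetical.

```python
import json
from pathlib import Path


def load_hits(path):
    """Read one saved result file into a ranked list of (pk, similarity) pairs.

    Assumption: the file is a JSON array of objects with "pk" and "_similarity".
    """
    rows = json.loads(Path(path).read_text())
    return [(row["pk"], float(row["_similarity"])) for row in rows]


def compare_hits(first, second):
    """Summarise how two ranked result lists differ."""
    pks_first = [pk for pk, _ in first]
    pks_second = [pk for pk, _ in second]
    scores_first = dict(first)
    scores_second = dict(second)
    # Score drift is only defined for items present in both result lists.
    shared = [pk for pk in pks_first if pk in scores_second]
    max_diff = max(
        (abs(scores_first[pk] - scores_second[pk]) for pk in shared),
        default=0.0,
    )
    return {
        "same_items": set(pks_first) == set(pks_second),
        "same_order": pks_first == pks_second,
        "only_in_first": sorted(set(pks_first) - set(pks_second)),
        "only_in_second": sorted(set(pks_second) - set(pks_first)),
        "max_score_diff": max_diff,
    }


if __name__ == "__main__":
    files = ["test1.txt", "test2.txt", "test3.txt"]
    hits = {name: load_hits(name) for name in files}
    pairs = [(files[0], files[1]), (files[1], files[2]), (files[0], files[2])]
    for a, b in pairs:
        print(a, "vs", b, compare_hits(hits[a], hits[b]))
```

Reporting item overlap, ordering, and score drift separately makes it easier to tell whether the three runs return different neighbours or only slightly different scores for the same neighbours.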

Expected behavior

For an unchanged dataset and query command, the query results should be the same, even if the index is dropped or recreated multiple times.

Additional information

No response
