[Bug]: The results returned by the count(*) are inaccurate and keep changing #33955

Open
1 task done
ThreadDao opened this issue Jun 18, 2024 · 5 comments
Labels
kind/bug Issues or changes related a bug priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. triage/accepted Indicates an issue or PR is ready to be actively worked on.

@ThreadDao
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.4-20240618-79546a6c-amd64
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Deploy Milvus with the following config:

  config:
    dataCoord:
      segment:
        sealProportion: 1.52e-05
    log:
      level: debug
    quotaAndLimits:
      flushRate:
        enabled: true
        max: 0.1 
    trace:
      exporter: jaeger
      jaeger:
        url: http://tempo-distributor.tempo:14268/api/traces

test steps

  1. create collection with 1024 partitions (partition-key), 1 shard
  2. create index
  3. insert 10m-128d data -> flush
  4. index -> load
  5. concurrent requests: search + upsert + flush
  6. Expected count(*): 10m. Actual results:
>>> from pymilvus import Collection, connections, utility
>>> connections.disconnect("default")
>>> connections.connect(host="10.104.13.204")
>>> utility.list_collections()
['fouram_6xJhw13C']
>>> c = Collection(name='fouram_6xJhw13C')
>>> c.query('id >=0', output_fields=["count(*)"], consistency_level="Strong")
data: ["{'count(*)': 10012932}"] ..., extra_info: {'cost': 0}
>>> c.query('id >=0', output_fields=["count(*)"], consistency_level="Strong")
data: ["{'count(*)': 10012407}"] ..., extra_info: {'cost': 0}
>>> c.name
'fouram_6xJhw13C'
>>> c.query('id >=0', output_fields=["count(*)"], consistency_level="Strong")
data: ["{'count(*)': 10012372}"] ..., extra_info: {'cost': 0}
>>> c.query('id >=0', output_fields=["count(*)"], consistency_level="Strong")
data: ["{'count(*)': 10012294}"] ..., extra_info: {'cost': 0}
>>> c.query('id >=0', output_fields=["count(*)"], consistency_level="Strong")
data: ["{'count(*)': 10012261}"] ..., extra_info: {'cost': 0}
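
To make the drift in the session above easier to read, the following sketch subtracts the expected row count from each observed count(*) and lists the change between consecutive queries. The numbers are copied from the output above; the variable names are only illustrative.

```python
# Observed count(*) results, copied from the session above (in query order).
EXPECTED = 10_000_000
observed = [10012932, 10012407, 10012372, 10012294, 10012261]

# Surplus rows relative to the expected 10m entities.
surplus = [count - EXPECTED for count in observed]

# Change between each pair of consecutive queries.
deltas = [later - earlier for earlier, later in zip(observed, observed[1:])]

for count, extra in zip(observed, surplus):
    print(f"count(*)={count}  surplus={extra}")
print("deltas between consecutive queries:", deltas)
```

Every result exceeds the expected 10m, and the surplus shrinks from one query to the next.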

Expected Behavior

No response

Steps To Reproduce

https://argo-workflows.zilliz.cc/archived-workflows/qa/88b56c6a-eb3d-4862-95a6-b0c64434efde?nodeId=compact-opt-1024-with-flush-2

Milvus Log

compact-opt-flush2-milvus-datanode-5898b9d778-sshqx               1/1     Running     0                82m     10.104.5.70     4am-node12   <none>           <none>
compact-opt-flush2-milvus-indexnode-8c577d9d6-9tnms               1/1     Running     0                82m     10.104.17.163   4am-node23   <none>           <none>
compact-opt-flush2-milvus-indexnode-8c577d9d6-9wl8n               1/1     Running     0                82m     10.104.6.58     4am-node13   <none>           <none>
compact-opt-flush2-milvus-indexnode-8c577d9d6-qq9c4               1/1     Running     0                82m     10.104.20.226   4am-node22   <none>           <none>
compact-opt-flush2-milvus-mixcoord-5b9f79b984-zwfn2               1/1     Running     0                82m     10.104.4.88     4am-node11   <none>           <none>
compact-opt-flush2-milvus-proxy-b55c6db47-vnzc2                   1/1     Running     0                82m     10.104.13.204   4am-node16   <none>           <none>
compact-opt-flush2-milvus-querynode-0-786c99d5cc-k4bcz            1/1     Running     0                82m     10.104.18.196   4am-node25   <none>           <none>
compact-opt-flush2-milvus-querynode-0-786c99d5cc-q5znj            1/1     Running     0                82m     10.104.13.205   4am-node16   <none>           <none>

Anything else?

No response

@ThreadDao ThreadDao added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 18, 2024
@ThreadDao ThreadDao added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Jun 18, 2024
@ThreadDao ThreadDao added this to the 2.4.5 milestone Jun 18, 2024
@yanliang567
Contributor

/unassign

@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 19, 2024
@sunby
Contributor

sunby commented Jun 19, 2024

I've identified a potential issue. If a segment's DML position is larger than the earliest growing segment's start position, it might not be included in the L0 compaction, resulting in some data not being deleted.
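
The condition described in this comment can be illustrated with a toy model. This is a sketch under assumptions, not Milvus code: the `Segment` class, the integer positions, and `select_l0_targets` are hypothetical, and the selection rule is only the one suspected in the comment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    # Hypothetical stand-in for a segment; positions are plain integers here.
    name: str
    state: str  # "growing" or "sealed"
    start_position: int
    dml_position: int


def select_l0_targets(segments: List[Segment]) -> List[Segment]:
    """Pick sealed segments whose DML position does not exceed the earliest
    growing segment's start position (the rule suspected in the comment)."""
    growing = [s for s in segments if s.state == "growing"]
    sealed = [s for s in segments if s.state == "sealed"]
    if not growing:
        # Assumption for this sketch: with no growing segment, nothing is filtered.
        return sealed
    earliest_start = min(s.start_position for s in growing)
    return [s for s in sealed if s.dml_position <= earliest_start]


segments = [
    Segment("sealed-a", "sealed", start_position=10, dml_position=80),
    Segment("sealed-b", "sealed", start_position=120, dml_position=150),
    Segment("growing-c", "growing", start_position=100, dml_position=130),
]

selected = [s.name for s in select_l0_targets(segments)]
missed = [s.name for s in segments if s.state == "sealed" and s.name not in selected]
print("selected:", selected)
print("missed:", missed)
```

In this model `sealed-b` is skipped because its DML position (150) is past the start position of `growing-c` (100), so deletes targeting its rows would not be applied by that compaction.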

@XuanYang-cn
Contributor

One more possible case:
#33907 fixed a bug where deletions were written into the wrong partitions, so some inserted data was not deleted correctly and count(*) increased.

The commit id was: e807183
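
A toy model of why a misrouted delete inflates count(*), assuming upsert behaves as a delete followed by an insert of the same primary key. Everything here is hypothetical and for illustration only: the modulo routing in `partition_of`, the "neighbouring partition" form of the bug, and the tiny partition count are not taken from Milvus internals.

```python
from typing import Dict, List

NUM_PARTITIONS = 4  # toy value; the test in this issue uses 1024 partitions


def partition_of(pk: int) -> int:
    # Hypothetical routing rule, used only for this sketch.
    return pk % NUM_PARTITIONS


def upsert(partitions: Dict[int, List[int]], pk: int, misroute_delete: bool) -> None:
    """Model upsert as delete-then-insert of the same primary key."""
    home = partition_of(pk)
    # With the modelled bug, the delete lands in a neighbouring partition.
    delete_target = (home + 1) % NUM_PARTITIONS if misroute_delete else home
    partitions[delete_target] = [row for row in partitions[delete_target] if row != pk]
    partitions[home].append(pk)


def count_rows(partitions: Dict[int, List[int]]) -> int:
    return sum(len(rows) for rows in partitions.values())


def run(misroute_delete: bool) -> int:
    partitions: Dict[int, List[int]] = {p: [] for p in range(NUM_PARTITIONS)}
    for pk in range(8):  # initial insert of 8 rows
        partitions[partition_of(pk)].append(pk)
    for pk in range(3):  # upsert 3 of the existing rows
        upsert(partitions, pk, misroute_delete)
    return count_rows(partitions)


print("correct routing:", run(misroute_delete=False))
print("misrouted deletes:", run(misroute_delete=True))
```

With correct routing the count stays at 8; with misrouted deletes each upsert leaves the old row behind, so the count grows by one per upsert.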

@XuanYang-cn
Contributor

I've identified a potential issue. If a segment's DML position is larger than the earliest growing segment's start position, it might not be included in the L0 compaction, resulting in some data not being deleted.

Good catch, but this is very rare: it requires two growing segments to exist at the same time, with the later one sealed before the earlier one.

sunby added a commit to sunby/milvus that referenced this issue Jun 19, 2024
issue: milvus-io#33955

Signed-off-by: sunby <sunbingyi1992@gmail.com>
sre-ci-robot pushed a commit that referenced this issue Jun 21, 2024
issue: #33955

Signed-off-by: sunby <sunbingyi1992@gmail.com>
@ThreadDao
Contributor Author

@sunby Could you cherry-pick the fix PR to the 2.4 branch?

@yanliang567 yanliang567 modified the milestones: 2.4.5, 2.4.6 Jun 26, 2024
yellow-shine pushed a commit to yellow-shine/milvus that referenced this issue Jul 2, 2024
czs007 added a commit to czs007/milvus that referenced this issue Jul 2, 2024
issue : milvus-io#32939
issue: milvus-io#33955

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
czs007 added a commit that referenced this issue Jul 3, 2024
This PR cherry-picks the following commits related to data compaction:
- enhance: Refine compaction.
[#33982](#33982)
- fix l0 compaction may miss some sealed segments.
[#33838](#33980)

issue : #32939
#33955

pr : #33982

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>

4 participants