[Bug]: DataNode panic: segment not found #33696

Closed
1 task done
ThreadDao opened this issue Jun 6, 2024 · 7 comments
Assignees
Labels
  • kind/bug: Issues or changes related to a bug
  • priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.
Milestone

Comments

@ThreadDao
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: master-20240605-feeb869f-amd64
- Deployment mode(standalone or cluster):cluster
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Test steps:

  1. Create a collection with 2 shards and a partition key (32 partitions).
  2. Create an index.
  3. Insert 30M 128-dimensional vectors, then flush.
  4. Index and load.
  5. Send concurrent requests: search + flush + upsert.
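
For anyone reproducing this, step 5 can be sketched roughly as below. This is a minimal sketch, not the actual test harness: the `Workload` interface, `noopWorkload`, `runConcurrent`, and `demo` are hypothetical names, and a real run would back the three methods with SDK calls against the collection created in steps 1 to 4.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// Workload is a hypothetical interface; a real test would implement
// these methods with SDK calls against the loaded collection.
type Workload interface {
	Search(ctx context.Context) error
	Flush(ctx context.Context) error
	Upsert(ctx context.Context) error
}

// runConcurrent loops each operation in its own goroutine until ctx is
// done (or the operation fails) and returns how often each one succeeded.
func runConcurrent(ctx context.Context, w Workload) map[string]int64 {
	ops := map[string]func(context.Context) error{
		"search": w.Search,
		"flush":  w.Flush,
		"upsert": w.Upsert,
	}
	counts := make(map[string]*atomic.Int64, len(ops))
	var wg sync.WaitGroup
	for name, op := range ops {
		counter := new(atomic.Int64)
		counts[name] = counter
		wg.Add(1)
		go func(op func(context.Context) error) {
			defer wg.Done()
			for ctx.Err() == nil {
				if err := op(ctx); err != nil {
					return
				}
				counter.Add(1)
			}
		}(op)
	}
	wg.Wait()

	result := make(map[string]int64, len(counts))
	for name, c := range counts {
		result[name] = c.Load()
	}
	return result
}

// noopWorkload is a placeholder that only sleeps; it never touches a server.
type noopWorkload struct{}

func (noopWorkload) Search(context.Context) error { time.Sleep(time.Millisecond); return nil }
func (noopWorkload) Flush(context.Context) error  { time.Sleep(time.Millisecond); return nil }
func (noopWorkload) Upsert(context.Context) error { time.Sleep(time.Millisecond); return nil }

// demo runs the placeholder workload for a short, fixed duration.
func demo() map[string]int64 {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	return runConcurrent(ctx, noopWorkload{})
}

func main() {
	fmt.Println(demo())
}
```

The three loops share nothing but the context, so flush and upsert can interleave freely with search, which is the kind of overlap the test relies on.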

dataNode panic:

dn_9jmvj.log
dn_nkhhw.log

[2024/06/06 05:51:30.330 +00:00] [INFO] [datanode/services.go:289] ["sync segments is empty, skip it"] [traceID=7a9e4683126dcb666b43933cca25d779] [planID=0] [nodeID=18] [collectionID=450259552513743671] [partitionID=450259552513743694] [channel=compact-master--key-op-28-3921-rootcoord-dml_1_450259552513743671v1]
[2024/06/06 05:51:30.330 +00:00] [INFO] [syncmgr/task.go:210] ["task done"] [collectionID=450259552513743671] [partitionID=450259552513743703] [segmentID=450259552558468018] [channel=compact-master--key-op-28-3921-rootcoord-dml_1_450259552513743671v1] [level=L1] [flushedSize=0]
[2024/06/06 05:51:30.330 +00:00] [INFO] [syncmgr/task.go:210] ["task done"] [collectionID=450259552513743671] [partitionID=450259552513743675] [segmentID=450259552558468022] [channel=compact-master--key-op-28-3921-rootcoord-dml_1_450259552513743671v1] [level=L1] [flushedSize=0]
[2024/06/06 05:51:30.335 +00:00] [INFO] [syncmgr/sync_manager.go:157] ["sync mgr sumbit task with key"] [key=450259552558468285]
[2024/06/06 05:51:30.336 +00:00] [WARN] [syncmgr/task.go:195] ["failed to save serialized data into storage"] [collectionID=450259552513743671] [partitionID=450259552513743693] [segmentID=450259552558468299] [channel=compact-master--key-op-28-3921-rootcoord-dml_1_450259552513743671v1] [level=L1] [error="segment not found[segment=450259552558468299]"]
[2024/06/06 05:51:30.336 +00:00] [ERROR] [conc/options.go:54] ["Conc pool panicked"] [panic="segment not found[segment=450259552558468299]"] [stack="github.com/milvus-io/milvus/pkg/util/conc.(*poolOption).antsOptions.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/options.go:54\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1.1\n\t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:54\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1.1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:74\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*storageV1Serializer).setTaskMeta.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/storage_serializer.go:156\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).HandleError\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:117\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).Run.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:132\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).Run\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:196\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*keyLockDispatcher[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/key_lock_dispatcher.go:37\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:81\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1\n\t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:67"]
panic: segment not found[segment=450259552558468299] [recovered]
    panic: segment not found[segment=450259552558468299] [recovered]
    panic: segment not found[segment=450259552558468299]

goroutine 181328 [running]:
panic({0x55f7220?, 0xc003561b30?})
    /usr/local/go/src/runtime/panic.go:1017 +0x3ac fp=0xc000e75558 sp=0xc000e754a8 pc=0x1df2aec
github.com/milvus-io/milvus/pkg/util/conc.(*poolOption).antsOptions.func1({0x55f7220, 0xc003561b30})
    /go/src/github.com/milvus-io/milvus/pkg/util/conc/options.go:56 +0x146 fp=0xc000e75620 sp=0xc000e75558 pc=0x3a2e206
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
    /go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:54 +0x6d fp=0xc000e75698 sp=0xc000e75620 pc=0x3a2b86d
runtime.deferCallSave(0xc000e75750, 0xc000e75fb8?)
    /usr/local/go/src/runtime/panic.go:798 +0x84 fp=0xc000e756a8 sp=0xc000e75698 pc=0x1df26a4
runtime.runOpenDeferFrame(0xc01e2f3900)
    /usr/local/go/src/runtime/panic.go:771 +0x1b8 fp=0xc000e756e8 sp=0xc000e756a8 pc=0x1df24d8
panic({0x55f7220?, 0xc003561b30?})
    /usr/local/go/src/runtime/panic.go:914 +0x21f fp=0xc000e75798 sp=0xc000e756e8 pc=0x1df295f
github.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1.1()
    /go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:74 +0x8d fp=0xc000e757f8 sp=0xc000e75798 pc=0x4d52f8d
runtime.deferCallSave(0xc000e758b0, 0xc000e75f50?)
    /usr/local/go/src/runtime/panic.go:798 +0x84 fp=0xc000e75808 sp=0xc000e757f8 pc=0x1df26a4
runtime.runOpenDeferFrame(0xc01c130dc0)
    /usr/local/go/src/runtime/panic.go:771 +0x1b8 fp=0xc000e75848 sp=0xc000e75808 pc=0x1df24d8
panic({0x55f7220?, 0xc003561b30?})
    /usr/local/go/src/runtime/panic.go:914 +0x21f fp=0xc000e758f8 sp=0xc000e75848 pc=0x1df295f
github.com/milvus-io/milvus/internal/datanode/syncmgr.(*storageV1Serializer).setTaskMeta.func1({0x5eba820?, 0xc003561b30?})
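
A note on reading the trace: the repeated `panic: ... [recovered]` lines indicate that the same panic value was recovered by a deferred handler and then panicked again, once per layer (the stack shows such handlers in `conc/pool.go` and `conc/options.go`). A minimal standard-library sketch of that pattern follows; `rethrow` and `runTask` are hypothetical names for illustration and are not Milvus code.

```go
package main

import (
	"errors"
	"fmt"
)

// errSegmentNotFound stands in for the error value seen in the log.
var errSegmentNotFound = errors.New("segment not found")

// rethrow runs fn and, if fn panics, reports the panic at this layer and
// panics again with the same value, as a pool-level panic handler might.
func rethrow(layer string, fn func()) {
	defer func() {
		if r := recover(); r != nil {
			fmt.Printf("%s recovered: %v\n", layer, r)
			// Re-panic. If nothing further up recovers it, the process
			// crashes and the runtime lists the earlier panic as recovered.
			panic(r)
		}
	}()
	fn()
}

// runTask nests two handler layers around a task that panics and returns
// the value that finally escapes both of them.
func runTask() (escaped any) {
	defer func() { escaped = recover() }()
	rethrow("worker", func() {
		rethrow("pool", func() {
			panic(errSegmentNotFound)
		})
	})
	return nil
}

func main() {
	fmt.Println("escaped:", runTask())
}
```

In this sketch the outermost `recover` in `runTask` stops the crash so the program can print the result. In the DataNode trace nothing recovers the final panic, so the process exits.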

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

compact-master--key-op-28-3921-etcd-0                             1/1     Running       0                 21h     10.104.20.143   4am-node22   <none>           <none>
compact-master--key-op-28-3921-etcd-1                             1/1     Running       0                 21h     10.104.33.237   4am-node36   <none>           <none>
compact-master--key-op-28-3921-etcd-2                             1/1     Running       0                 21h     10.104.32.116   4am-node39   <none>           <none>
compact-master--key-op-28-3921-milvus-datanode-5fd975fdd7-9jmvj   1/1     Running       3 (7h38m ago)     21h     10.104.6.163    4am-node13   <none>           <none>
compact-master--key-op-28-3921-milvus-datanode-5fd975fdd7-nkhhw   1/1     Running       5 (8h ago)        21h     10.104.4.207    4am-node11   <none>           <none>
compact-master--key-op-28-3921-milvus-indexnode-65f885f6868g65f   1/1     Running       0                 21h     10.104.9.211    4am-node14   <none>           <none>
compact-master--key-op-28-3921-milvus-indexnode-65f885f686dwdcg   1/1     Running       0                 21h     10.104.5.153    4am-node12   <none>           <none>
compact-master--key-op-28-3921-milvus-mixcoord-675d78db58-tj6vm   1/1     Running       0                 21h     10.104.14.233   4am-node18   <none>           <none>
compact-master--key-op-28-3921-milvus-proxy-795d4d49cb-lbgdt      1/1     Running       0                 21h     10.104.4.206    4am-node11   <none>           <none>
compact-master--key-op-28-3921-milvus-querynode-0-7956bf4694jrs   1/1     Running       1 (7h46m ago)     21h     10.104.13.191   4am-node16   <none>           <none>
compact-master--key-op-28-3921-milvus-querynode-0-7956bf46dq7lx   1/1     Running       1 (8h ago)        21h     10.104.21.134   4am-node24   <none>           <none>
compact-master--key-op-28-3921-milvus-querynode-0-7956bf46nccjl   1/1     Running       0                 21h     10.104.32.119   4am-node39   <none>           <none>
compact-master--key-op-28-3921-milvus-querynode-0-7956bf46xkg5t   1/1     Running       1 (9h ago)        21h     10.104.16.231   4am-node21   <none>           <none>
compact-master--key-op-28-3921-minio-0                            1/1     Running       0                 21h     10.104.15.247   4am-node20   <none>           <none>
compact-master--key-op-28-3921-minio-1                            1/1     Running       0                 21h     10.104.32.117   4am-node39   <none>           <none>
compact-master--key-op-28-3921-minio-2                            1/1     Running       0                 21h     10.104.33.239   4am-node36   <none>           <none>
compact-master--key-op-28-3921-minio-3                            1/1     Running       0                 21h     10.104.20.144   4am-node22   <none>           <none>
compact-master--key-op-28-3921-pulsar-bookie-0                    1/1     Running       0                 21h     10.104.15.248   4am-node20   <none>           <none>
compact-master--key-op-28-3921-pulsar-bookie-1                    1/1     Running       0                 21h     10.104.33.241   4am-node36   <none>           <none>
compact-master--key-op-28-3921-pulsar-bookie-2                    1/1     Running       0                 21h     10.104.26.208   4am-node32   <none>           <none>
compact-master--key-op-28-3921-pulsar-bookie-init-ll4v5           0/1     Completed     0                 21h     10.104.6.160    4am-node13   <none>           <none>
compact-master--key-op-28-3921-pulsar-broker-0                    1/1     Running       0                 21h     10.104.32.113   4am-node39   <none>           <none>
compact-master--key-op-28-3921-pulsar-proxy-0                     1/1     Running       0                 21h     10.104.6.161    4am-node13   <none>           <none>
compact-master--key-op-28-3921-pulsar-pulsar-init-79lzv           0/1     Completed     0                 21h     10.104.13.185   4am-node16   <none>           <none>
compact-master--key-op-28-3921-pulsar-recovery-0                  1/1     Running       0                 21h     10.104.20.139   4am-node22   <none>           <none>
compact-master--key-op-28-3921-pulsar-zookeeper-0                 1/1     Running       0                 21h     10.104.32.118   4am-node39   <none>           <none>
compact-master--key-op-28-3921-pulsar-zookeeper-1                 1/1     Running       0                 21h     10.104.15.252   4am-node20   <none>           <none>
compact-master--key-op-28-3921-pulsar-zookeeper-2                 1/1     Running       0                 21h     10.104.16.230   4am-node21   <none>           <none>

Anything else?

No response

ThreadDao added the kind/bug and needs-triage labels on Jun 6, 2024
ThreadDao added the priority/critical-urgent label on Jun 6, 2024
ThreadDao added this to the 2.5.0 milestone on Jun 6, 2024
@ThreadDao
Contributor Author

ThreadDao commented Jun 6, 2024

There are also other problems:

  1. No L0 compaction was triggered, and the L1 MixCompaction did not complete.

metrics of compact-master--key-op-28-3921

  2. The queryNode was OOM-killed (possibly because, after the dataNode restarted, a large number of upserts sharply increased the memory used by growing segments).

@xiaofan-luan
Contributor

/assign @XuanYang-cn

yanliang567 added the triage/accepted label and removed the needs-triage label on Jun 7, 2024
yanliang567 removed their assignment on Jun 7, 2024
@ThreadDao
Contributor Author

@czs007 @XuanYang-cn

  • image: master-20240612-9ab3058d-amd64
    panic: segment not found
[2024/06/13 06:50:43.939 +00:00] [INFO] [metacache/meta_cache.go:298] ["remove dropped segment"] [segmentID=450428466827972529]
[2024/06/13 06:50:43.939 +00:00] [INFO] [metacache/meta_cache.go:298] ["remove dropped segment"] [segmentID=450428466827972540]
[2024/06/13 06:50:43.941 +00:00] [WARN] [syncmgr/task.go:199] ["failed to save serialized data into storage"] [collectionID=450428466818449735] [partitionID=450428466818449736] [segmentID=450428466827972540] [channel=compact-master-sert-op-37-7267-rootcoord-dml_1_450428466818449735v1] [level=L1] [error="segment not found[segment=450428466827972540]"]
[2024/06/13 06:50:43.941 +00:00] [ERROR] [conc/options.go:54] ["Conc pool panicked"] [panic="segment not found[segment=450428466827972540]"] [stack="github.com/milvus-io/milvus/pkg/util/conc.(*poolOption).antsOptions.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/options.go:54\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1.1\n\t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:54\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1.1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:74\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*storageV1Serializer).setTaskMeta.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/storage_serializer.go:156\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).HandleError\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:117\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).Run.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:132\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).Run\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:200\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*keyLockDispatcher[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/key_lock_dispatcher.go:37\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:81\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1\n\t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:67"]
panic: segment not found[segment=450428466827972540] [recovered]
    panic: segment not found[segment=450428466827972540] [recovered]
    panic: segment not found[segment=450428466827972540]

goroutine 320036 [running]:
panic({0x5684060?, 0xc008fdc840?})
    /usr/local/go/src/runtime/panic.go:1017 +0x3ac fp=0xc0142db548 sp=0xc0142db498 pc=0x1e21b2c
github.com/milvus-io/milvus/pkg/util/conc.(*poolOption).antsOptions.func1({0x5684060, 0xc008fdc840})
    /go/src/github.com/milvus-io/milvus/pkg/util/conc/options.go:56 +0x146 fp=0xc0142db610 sp=0xc0142db548 pc=0x3a6a4c6
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
    /go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:54 +0x6d fp=0xc0142db688 sp=0xc0142db610 pc=0x3a67b2d
runtime.deferCallSave(0xc0142db740, 0xc0142dbfb8?)
    /usr/local/go/src/runtime/panic.go:798 +0x84 fp=0xc0142db698 sp=0xc0142db688 pc=0x1e216e4
runtime.runOpenDeferFrame(0xc000d66960)
    /usr/local/go/src/runtime/panic.go:771 +0x1b8 fp=0xc0142db6d8 sp=0xc0142db698 pc=0x1e21518
panic({0x5684060?, 0xc008fdc840?})
    /usr/local/go/src/runtime/panic.go:914 +0x21f fp=0xc0142db788 sp=0xc0142db6d8 pc=0x1e2199f
github.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1.1()
    /go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:74 +0x8d fp=0xc0142db7e8 sp=0xc0142db788 pc=0x4dd122d
runtime.deferCallSave(0xc0142db8a0, 0xc0142dbf50?)
    /usr/local/go/src/runtime/panic.go:798 +0x84 fp=0xc0142db7f8 sp=0xc0142db7e8 pc=0x1e216e4
runtime.runOpenDeferFrame(0xc003b203c0)
    /usr/local/go/src/runtime/panic.go:771 +0x1b8 fp=0xc0142db838 sp=0xc0142db7f8 pc=0x1e21518
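
The first two log lines show segment 450428466827972540 being removed from the meta cache about 2 ms before the sync task for the same segment fails. One plausible reading is that a sync task queued earlier ran after the segment had been dropped. The sketch below only illustrates that ordering and one defensive option (returning an error that the caller can choose to skip on, rather than panicking). `metaCache`, `removeDropped`, and `syncSegment` are hypothetical simplifications, and this is not necessarily how the issue was eventually fixed.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrSegmentNotFound mirrors the error text seen in the DataNode log.
var ErrSegmentNotFound = errors.New("segment not found")

// metaCache is a hypothetical, heavily simplified stand-in for the
// DataNode's per-channel segment metadata cache.
type metaCache struct {
	mu       sync.RWMutex
	segments map[int64]struct{}
}

func newMetaCache(ids ...int64) *metaCache {
	c := &metaCache{segments: make(map[int64]struct{}, len(ids))}
	for _, id := range ids {
		c.segments[id] = struct{}{}
	}
	return c
}

// removeDropped deletes a segment, as happens once it has been dropped.
func (c *metaCache) removeDropped(id int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.segments, id)
}

func (c *metaCache) has(id int64) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	_, ok := c.segments[id]
	return ok
}

// syncSegment models a sync task that looks up its segment only when it
// finally runs. A missing segment is reported as an error, not a panic.
func syncSegment(c *metaCache, id int64) error {
	if !c.has(id) {
		return fmt.Errorf("%w[segment=%d]", ErrSegmentNotFound, id)
	}
	return nil
}

func main() {
	const segID int64 = 450428466827972540
	cache := newMetaCache(segID)

	cache.removeDropped(segID)       // the segment is dropped first
	err := syncSegment(cache, segID) // the queued sync task runs afterwards
	if errors.Is(err, ErrSegmentNotFound) {
		fmt.Println("skip sync:", err)
	}
}
```

Whether skipping is safe depends on why the segment disappeared. If the dropped segment's data was already persisted elsewhere, skipping loses nothing; otherwise the error should be surfaced.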

dn_64lfp.log

compact-master-sert-op-37-7267-etcd-0                             1/1     Running     0                27h     10.104.33.17    4am-node36   <none>           <none>
compact-master-sert-op-37-7267-etcd-1                             1/1     Running     0                27h     10.104.16.187   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-etcd-2                             1/1     Running     0                27h     10.104.20.95    4am-node22   <none>           <none>
compact-master-sert-op-37-7267-milvus-datanode-5f44dbd694-64lfp   1/1     Running     2 (23h ago)      27h     10.104.26.156   4am-node32   <none>           <none>
compact-master-sert-op-37-7267-milvus-datanode-5f44dbd694-67wpp   1/1     Running     2 (18h ago)      27h     10.104.16.210   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-milvus-indexnode-fc74f5df7-6hks6   1/1     Running     0                27h     10.104.34.196   4am-node37   <none>           <none>
compact-master-sert-op-37-7267-milvus-indexnode-fc74f5df7-tnjmc   1/1     Running     0                27h     10.104.1.59     4am-node10   <none>           <none>
compact-master-sert-op-37-7267-milvus-mixcoord-cbc7cd64c-xfz8r    1/1     Running     0                27h     10.104.26.155   4am-node32   <none>           <none>
compact-master-sert-op-37-7267-milvus-proxy-79c4df6f79-wzmq5      1/1     Running     0                27h     10.104.16.209   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-milvus-querynode-0-64854f68f8hgx   1/1     Running     0                27h     10.104.18.204   4am-node25   <none>           <none>
compact-master-sert-op-37-7267-milvus-querynode-0-64854f68n5tpv   1/1     Running     0                27h     10.104.26.157   4am-node32   <none>           <none>
compact-master-sert-op-37-7267-milvus-querynode-0-64854f68rcddm   1/1     Running     0                27h     10.104.16.213   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-milvus-querynode-0-64854f68wjqh4   1/1     Running     0                27h     10.104.25.144   4am-node30   <none>           <none>
compact-master-sert-op-37-7267-minio-0                            1/1     Running     0                27h     10.104.33.16    4am-node36   <none>           <none>
compact-master-sert-op-37-7267-minio-1                            1/1     Running     0                27h     10.104.16.193   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-minio-2                            1/1     Running     0                27h     10.104.17.225   4am-node23   <none>           <none>
compact-master-sert-op-37-7267-minio-3                            1/1     Running     0                27h     10.104.20.97    4am-node22   <none>           <none>
compact-master-sert-op-37-7267-pulsar-bookie-0                    1/1     Running     0                27h     10.104.16.197   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-pulsar-bookie-1                    1/1     Running     0                27h     10.104.33.20    4am-node36   <none>           <none>
compact-master-sert-op-37-7267-pulsar-bookie-2                    1/1     Running     0                27h     10.104.17.227   4am-node23   <none>           <none>
compact-master-sert-op-37-7267-pulsar-bookie-init-njm9t           0/1     Completed   0                27h     10.104.4.8      4am-node11   <none>           <none>
compact-master-sert-op-37-7267-pulsar-broker-0                    1/1     Running     0                27h     10.104.4.9      4am-node11   <none>           <none>
compact-master-sert-op-37-7267-pulsar-proxy-0                     1/1     Running     0                27h     10.104.6.66     4am-node13   <none>           <none>
compact-master-sert-op-37-7267-pulsar-pulsar-init-6mcrk           0/1     Completed   0                27h     10.104.6.63     4am-node13   <none>           <none>
compact-master-sert-op-37-7267-pulsar-recovery-0                  1/1     Running     0                27h     10.104.6.64     4am-node13   <none>           <none>
compact-master-sert-op-37-7267-pulsar-zookeeper-0                 1/1     Running     0                27h     10.104.32.177   4am-node39   <none>           <none>
compact-master-sert-op-37-7267-pulsar-zookeeper-1                 1/1     Running     0                27h     10.104.20.101   4am-node22   <none>           <none>
compact-master-sert-op-37-7267-pulsar-zookeeper-2                 1/1     Running     0                27h     10.104.17.229   4am-node23   <none>           <none>

@xiaofan-luan
Contributor

@czs007 please take care of it

@czs007
Contributor

czs007 commented Jun 25, 2024

working on it

@ThreadDao
Contributor Author

@czs007
image: master-20240626-9c2eeff4-amd64
dn_mzf2p.log

[2024/06/28 08:59:56.773 +00:00] [INFO] [metacache/meta_cache.go:299] ["remove dropped segment"] [segmentID=450768351030069674]
[2024/06/28 08:59:56.789 +00:00] [WARN] [syncmgr/task.go:169] ["failed to save serialized data into storage"] [collectionID=450768351016256039] [partitionID=450768351016256040] [segmentID=450768351030476406] [channel=level-master-insert-op-76-5164-rootcoord-dml_0_450768351016256039v0] [level=L1] [error="segment not found[segment=450768351030476406]"]
[2024/06/28 08:59:56.789 +00:00] [ERROR] [conc/options.go:54] ["Conc pool panicked"] [panic="segment not found[segment=450768351030476406]"] [stack="github.com/milvus-io/milvus/pkg/util/conc.(*poolOption).antsOptions.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/options.go:54\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1.1\n\t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:54\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1.1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:74\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*storageV1Serializer).setTaskMeta.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/storage_serializer.go:158\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).HandleError\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:107\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).Run.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:122\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).Run\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:170\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*keyLockDispatcher[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/key_lock_dispatcher.go:39\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:81\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1\n\t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:67"]
panic: segment not found[segment=450768351030476406] [recovered]
    panic: segment not found[segment=450768351030476406] [recovered]
    panic: segment not found[segment=450768351030476406]

goroutine 24223 [running]:
panic({0x56a1ea0?, 0xc00e598840?})
    /usr/local/go/src/runtime/panic.go:1017 +0x3ac fp=0xc0010795e8 sp=0xc001079538 pc=0x1e2cbac
github.com/milvus-io/milvus/pkg/util/conc.(*poolOption).antsOptions.func1({0x56a1ea0, 0xc00e598840})
    /go/src/github.com/milvus-io/milvus/pkg/util/conc/options.go:56 +0x146 fp=0xc0010796b0 sp=0xc0010795e8 pc=0x3ab64e6
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()

argo: https://argo-workflows.zilliz.cc/archived-workflows/qa/cd5ae9ad-6282-48f7-abc0-d62306a3f8ff?nodeId=level-zero-stable-master-tntnd-1014026323

sre-ci-robot pushed a commit that referenced this issue Jul 1, 2024
issue: #33696

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
xiaocai2333 added a commit to xiaocai2333/milvus that referenced this issue Jul 2, 2024
yellow-shine pushed a commit to yellow-shine/milvus that referenced this issue Jul 2, 2024
sre-ci-robot pushed a commit that referenced this issue Jul 2, 2024
…34301) (#34318)

issue: #33696

master pr: #34301

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
@ThreadDao
Contributor Author

Fixed in master-20240703-a501fa11-amd64.
