[Bug]: standalone panic with error invalid memory address or nil pointer dereference during test after chaos #34375

Closed
1 task done
zhuwenxing opened this issue Jul 3, 2024 · 4 comments
Labels
- kind/bug: Issues or changes related to a bug.
- priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
- severity/critical: Critical; leads to a crash, missing data, wrong results, or a function that doesn't work at all.
- triage/accepted: Indicates an issue or PR is ready to be actively worked on.
Comments

@zhuwenxing
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: master-20240701-87bccb1a-amd64
- Deployment mode (standalone or cluster): standalone
- MQ type (rocksmq, pulsar or kafka):
- SDK version (e.g. pymilvus v2.0.0rc2):
- OS (Ubuntu or CentOS):
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

[2024/07/02 20:52:50.293 +00:00] [INFO] [compaction/load_stats.go:113] ["Successfully load pk stats"] [segmentID=450875350949038792] [time=12.124248ms] [size=1429504]
[2024/07/02 20:52:50.293 +00:00] [INFO] [metacache/meta_cache.go:289] ["metacache does not have segment, add it"] [segmentID=450875350949038792]
[2024/07/02 20:52:50.293 +00:00] [INFO] [metacache/meta_cache.go:301] ["remove dropped segment"] [segmentID=450874971742697894]
[2024/07/02 20:52:50.293 +00:00] [INFO] [metacache/meta_cache.go:301] ["remove dropped segment"] [segmentID=450874971741697884]
[2024/07/02 20:52:50.293 +00:00] [INFO] [metacache/meta_cache.go:301] ["remove dropped segment"] [segmentID=450874971742898927]
[2024/07/02 20:52:50.293 +00:00] [INFO] [metacache/meta_cache.go:301] ["remove dropped segment"] [segmentID=450875350949438085]
[2024/07/02 20:52:50.293 +00:00] [INFO] [datacoord/session_manager.go:256] ["success to sync segments"] [nodeID=2] [planID=0]
[2024/07/02 20:52:50.293 +00:00] [INFO] [datacoord/sync_segments_scheduler.go:149] ["sync segments success"] [collectionID=450874971741889147] [partitionID=450874971741889148] [channelName=by-dev-rootcoord-dml_10_450874971741889147v1] [nodeID=2] [segments="[450874971741695845,450875350949036470,450875350949443485,450874971741692920,450875350949037202,450874971741692292,450875350949038792,450875350949443372,450874971741694456,450874971741697849]"]
[2024/07/02 20:52:50.293 +00:00] [INFO] [datanode/services.go:283] ["DataNode receives SyncSegments"] [traceID=c0595c782344373541fca702166dd0c2] [planID=0] [nodeID=2] [collectionID=450874971742092756] [partitionID=450874971742092757] [channel=by-dev-rootcoord-dml_13_450874971742092756v0]
[2024/07/02 20:52:50.294 +00:00] [ERROR] [conc/options.go:54] ["Conc pool panicked"] [panic="runtime error: invalid memory address or nil pointer dereference"] [stack="github.com/milvus-io/milvus/pkg/util/conc.(*poolOption).antsOptions.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/options.go:54\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1.1\n\t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:54\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1.1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:74\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\nruntime.panicmem\n\t/usr/local/go/src/runtime/panic.go:261\nruntime.sigpanic\n\t/usr/local/go/src/runtime/signal_unix.go:861\ngithub.com/milvus-io/milvus/internal/metastore/kv/binlog.DecompressBinLog\n\t/go/src/github.com/milvus-io/milvus/internal/metastore/kv/binlog/binlog.go:143\ngithub.com/milvus-io/milvus/internal/datanode.(*DataNode).SyncSegments.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/services.go:320\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:81\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1\n\t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:67"]
I20240702 20:52:50.295473  3582 InvertedIndexTantivy.cpp:127] [SERVER][Upload][milvus] index file: /var/lib/milvus/data/querynode/index_files/450875350949440007/1/ef7b18e79be24d21a432b0b0f3509630.idx added
I20240702 20:52:50.295585  3582 InvertedIndexTantivy.cpp:123] [SERVER][Upload][milvus] trying to add index file: /var/lib/milvus/data/querynode/index_files/450875350949440007/1/ef7b18e79be24d21a432b0b0f3509630.pos
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x429b494]
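
For readers following the trace: the panic surfaces in `binlog.DecompressBinLog` (binlog.go:143), called from the task submitted by `DataNode.SyncSegments` (services.go:320). The fault address `addr=0x8` is consistent with reading a field at a small offset through a nil pointer. The sketch below is not the Milvus code and not the fix from the linked PR; it only illustrates this class of failure and a defensive nil check. `SegmentInfo`, `FieldBinlog`, and `countBinlogPaths` are hypothetical names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// FieldBinlog and SegmentInfo are hypothetical stand-ins for illustration;
// they are NOT the Milvus datapb types.
type FieldBinlog struct {
	FieldID int64
	Paths   []string
}

type SegmentInfo struct {
	ID      int64
	Binlogs []*FieldBinlog
}

var errNilSegment = errors.New("segment info is nil")

// countBinlogPaths is a hypothetical helper. It checks for nil pointers
// before dereferencing, so a missing segment yields an error instead of
// a SIGSEGV-style panic.
func countBinlogPaths(seg *SegmentInfo) (int, error) {
	if seg == nil {
		return 0, errNilSegment
	}
	total := 0
	for _, fb := range seg.Binlogs {
		if fb == nil {
			continue // skip missing entries instead of dereferencing them
		}
		total += len(fb.Paths)
	}
	return total, nil
}

func main() {
	segments := map[int64]*SegmentInfo{
		1: {ID: 1, Binlogs: []*FieldBinlog{
			{FieldID: 100, Paths: []string{"a", "b"}},
			nil,
		}},
	}
	// ID 2 is absent from the map, so the lookup returns a nil *SegmentInfo,
	// similar to a segment that was dropped between lookup and use.
	for _, id := range []int64{1, 2} {
		n, err := countBinlogPaths(segments[id])
		if err != nil {
			fmt.Printf("segment %d: skipped: %v\n", id, err)
			continue
		}
		fmt.Printf("segment %d: %d binlog paths\n", id, n)
	}
}
```

Without the `seg == nil` guard, the range over `seg.Binlogs` would dereference the nil pointer and panic with the same "invalid memory address or nil pointer dereference" message seen in the log.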

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/chaos-test-kafka-cron/detail/chaos-test-kafka-cron/15404/pipeline
log:
artifacts-standalone-pod-failure-15404-server-logs.tar.gz

Anything else?

No response

zhuwenxing added the kind/bug and needs-triage (indicates an issue or PR lacks a `triage/foo` label and requires one) labels on Jul 3, 2024
zhuwenxing added the priority/critical-urgent and severity/critical labels on Jul 3, 2024
zhuwenxing added this to the 2.5.0 milestone on Jul 3, 2024
yanliang567 added the triage/accepted label and removed the needs-triage label on Jul 3, 2024
@weiliu1031
Contributor

should be fixed by #34389

sre-ci-robot pushed a commit that referenced this issue Jul 3, 2024
…eta (#34393)

issue: #34376 ,  #34375,  #34379

master pr: #34390

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Jul 3, 2024
issue: #34376 , #34379 , #34375

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
@weiliu1031
Contributor

please verify this with latest images

@weiliu1031
Contributor

/assign @zhuwenxing

@zhuwenxing
Contributor Author

verified and fixed in master-20240704-fcafdb6d-amd64
