[Bug]: channel by-dev-rootcoord-dml_11_443279871761864895v0 is not available in any replica #27355

Closed
1 task done
XDeviation opened this issue Sep 25, 2023 · 7 comments
Labels
kind/bug Issues or changes related a bug stale indicates no udpates for 30 days triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@XDeviation

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.2.12
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka):    pulsar
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus 2.3.1
- OS(Ubuntu or CentOS): Ubuntu 20.04
- CPU/Memory: ...
- GPU: None
- Others:

Current Behavior

schema:
{'auto_id': True, 'description': 'Object search collection', 'fields': [{'name': 'pk', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': True}, {'name': 'vector', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 1280}}], 'enable_dynamic_field': True}
collection.num_entities: 1000000

RPC error: [query], <MilvusException: (code=1, message=fail to query on all shard leaders, err=All attempts results:
attempt #1:All attempts results:
attempt #1:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_12_443279871761864895v1 is not available in any replica, err=<nil>
attempt #2:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_12_443279871761864895v1 is not available in any replica, err=<nil>
attempt #3:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_11_443279871761864895v0 is not available in any replica, err=<nil>
attempt #4:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_11_443279871761864895v0 is not available in any replica, err=<nil>
attempt #5:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_11_443279871761864895v0 is not available in any replica, err=<nil>
attempt #6:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_11_443279871761864895v0 is not available in any replica, err=<nil>
attempt #7:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_11_443279871761864895v0 is not available in any replica, err=<nil>
attempt #8:context deadline exceeded

attempt #2:context canceled
)>, <Time:{'RPC start': '2023-09-25 05:17:59.793247', 'RPC error': '2023-09-25 05:18:09.799420'}>
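The retry trace above names two different channels. As a reading aid, here is a minimal Python sketch that tallies the channels reported as unavailable and splits each name into parts. The helper names `unavailable_channels` and `split_channel` are hypothetical, and the assumed name layout (physical channel, then collection ID, then `v` plus a shard index) is an assumption about Milvus channel naming that nothing in this thread confirms.

```python
import re
from collections import Counter

# Matches the channel named in each failed attempt of the retry trace.
CHANNEL_RE = re.compile(r"channel (\S+) is not available in any replica")

# Assumed layout (not confirmed here): <physical channel>_<collection ID>v<shard index>
VCHANNEL_RE = re.compile(r"^(?P<pchannel>.+)_(?P<collection_id>\d+)v(?P<shard>\d+)$")


def unavailable_channels(error_text: str) -> Counter:
    """Count how many attempts reported each channel as unavailable."""
    return Counter(CHANNEL_RE.findall(error_text))


def split_channel(name: str) -> dict:
    """Split a channel name under the assumed layout; raise if it does not fit."""
    match = VCHANNEL_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected channel name: {name}")
    return {
        "pchannel": match["pchannel"],
        "collection_id": int(match["collection_id"]),
        "shard": int(match["shard"]),
    }
```

Applied to the trace in this report, the tally is two attempts for `by-dev-rootcoord-dml_12_443279871761864895v1` and five for `by-dev-rootcoord-dml_11_443279871761864895v0`. Under the assumed layout, both names carry the same collection ID, `443279871761864895`.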

Expected Behavior

All test cases pass.

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

@XDeviation XDeviation added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 25, 2023
@yanliang567
Contributor

@XDeviation Please provide the full Milvus logs for investigation. Refer to this doc to export the whole Milvus logs. For Milvus installed with docker-compose, you can use `docker-compose logs > milvus.log` to export the logs.
Also, if possible, please share a code snippet that reproduces the issue.

/assign @XDeviation
/unassign
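
Once the logs have been exported to a file, the lines that report an unavailable channel can be pulled out before attaching anything. The following is a minimal sketch, assuming the export landed in a plain-text file such as `milvus.log`; the function name `relevant_lines` is hypothetical and not part of any Milvus tooling.

```python
from typing import Iterable, Iterator

# The message text reported in this issue.
NEEDLE = "is not available in any replica"


def relevant_lines(lines: Iterable[str], channel: str = "") -> Iterator[str]:
    """Yield log lines that report an unavailable channel.

    If `channel` is given, only lines that mention that channel are kept;
    the default empty string matches every line.
    """
    for line in lines:
        if NEEDLE in line and channel in line:
            yield line.rstrip("\n")
```

A possible use is `with open("milvus.log", encoding="utf-8", errors="replace") as f: hits = list(relevant_lines(f))`. The full log is still what the maintainers ask for; the filter only helps locate the relevant time window.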

@yanliang567 yanliang567 added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 26, 2023
@axiangcoding

Same issue.
In my case, I use Milvus through the LangChain vectorstore integration with pymilvus. The error occurs intermittently, but not frequently.

@xiaofan-luan
Contributor

Could you provide the Milvus logs? You can use export-log.sh.

@axiangcoding

It's been a long time since it last happened. I will export the logs the next time it occurs.

stale bot commented Nov 19, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Nov 19, 2023
@stale stale bot closed this as completed Nov 26, 2023
@ronghuaihai

We are facing the same problem.
[image]

The proxy log is as follows:
[2023/12/09 08:38:46.775 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="All attempts results:\nattempt #1:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_4_443999969427532581v1 is not available in any replica, err=NodeOffline(nodeID=-1)\nattempt #2:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_4_443999969427532581v1 is not available in any replica, err=NodeOffline(nodeID=-1)\nattempt #3:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_4_443999969427532581v1 is not available in any replica, err=NodeOffline(nodeID=-1)\nattempt #4:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_4_443999969427532581v1 is not available in any replica, err=NodeOffline(nodeID=-1)\nattempt #5:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_4_443999969427532581v1 is not available in any replica, err=NodeOffline(nodeID=-1)\nattempt #6:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_4_443999969427532581v1 is not available in any replica, err=NodeOffline(nodeID=-1)\nattempt #7:fail to get shard leaders from QueryCoord: channel by-dev-rootcoord-dml_4_443999969427532581v1 is not available in any replica, err=NodeOffline(nodeID=-1)\nattempt #8:context deadline exceeded\n"]

How can we solve this problem?
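
To make the long proxy log line above easier to read, here is a minimal Python sketch that splits it into its timestamp, level, caller, and the reason given for each attempt. The function name `parse_proxy_line` is hypothetical, and the bracketed field layout is inferred only from the single line quoted above, so other log lines may not fit.

```python
import re

# Leading fields as they appear in the quoted line: [time] [level] [caller]
HEADER_RE = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[(?P<level>[^\]]+)\] \[(?P<caller>[^\]]+)\]"
)
# In the quoted line, attempts are separated by a literal backslash followed by "n".
ATTEMPT_RE = re.compile(r"attempt #(\d+):(.*?)(?=\\n|$)")
ERR_RE = re.compile(r"err=(.+)$")


def parse_proxy_line(line: str) -> dict:
    """Return the header fields plus a list of (attempt number, reason)."""
    header = HEADER_RE.match(line)
    if header is None:
        raise ValueError("line does not start with [time] [level] [caller]")
    attempts = []
    for number, message in ATTEMPT_RE.findall(line):
        err = ERR_RE.search(message)
        # Fall back to the whole message when there is no err= part.
        attempts.append((int(number), err.group(1) if err else message))
    return {**header.groupdict(), "attempts": attempts}
```

On the line quoted above, attempts 1 through 7 give the reason `NodeOffline(nodeID=-1)` and attempt 8 gives `context deadline exceeded`. The original report in this issue shows `err=<nil>` for the same message instead.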

@xiaofan-luan
Contributor

Please provide the full log and open a separate issue, thanks.
