Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: milvus 搜索永久阻塞(milvus Search permanent blocking) #22070

Closed
1 task done
ponponon opened this issue Feb 8, 2023 · 20 comments
Closed
1 task done

[Bug]: milvus 搜索永久阻塞(milvus Search permanent blocking) #22070

ponponon opened this issue Feb 8, 2023 · 20 comments
Assignees
Labels
kind/bug Issues or changes related a bug stale indicates no udpates for 30 days triage/accepted Indicates an issue or PR is ready to be actively worked on.
Milestone

Comments

@ponponon
Copy link

ponponon commented Feb 8, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:2.2.2
- Deployment mode(standalone or cluster):standalone
- MQ type(rocksmq, pulsar or kafka): None   
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus2.2.1
- OS(Ubuntu or CentOS): docker container
- CPU/Memory: 16core 32GB
- GPU: None
- Others: None

Current Behavior

milvus 出现无法搜索了怎么办?现象是 CPU 100% 十几秒之后回落到很低,然后所有的 search 请求都卡住了,永远没有结果,milvus 的 cpu 也是超低,看起来没有在干活,是触发到什么 bug 了吗?

What should I do if milvus appears to be unable to search? The phenomenon is that the CPU is 100% and then drops back to very low after a dozen seconds, then all search requests are stuck and never get any results, the milvus cpu is also super low and doesn't seem to be working, is it triggered by some bug?

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

] [INFO] [indexcoord/index_coord.go:588] ["IndexCoord completeIndexInfo"] [collID=439314234651381188] [indexName=index_hash_code]
[2023/02/08 10:29:49.708 +00:00] [DEBUG] [indexcoord/meta_table.go:651] ["IndexCoord get index states success"] [indexID=439314234651381196] [total=0] [None=0] [Unissued=0] [InProgress=0] [Finished=0] [Failed=0]
[2023/02/08 10:29:49.708 +00:00] [DEBUG] [indexcoord/meta_table.go:728] ["IndexCoord get index states success"] [indexID=439314234651381196] [indexRows=26934]
[2023/02/08 10:29:49.709 +00:00] [INFO] [indexcoord/index_coord.go:621] ["IndexCoord completeIndexInfo success"] [collID=439314234651381188] [totalRows=26934] [indexRows=26934] [state=Finished] [failReason=]
[2023/02/08 10:29:49.709 +00:00] [INFO] [indexcoord/index_coord.go:588] ["IndexCoord completeIndexInfo"] [collID=439314234651381188] [indexName=index_vector]
[2023/02/08 10:29:49.709 +00:00] [DEBUG] [indexcoord/meta_table.go:651] ["IndexCoord get index states success"] [indexID=439314234651381194] [total=0] [None=0] [Unissued=0] [InProgress=0] [Finished=0] [Failed=0]
[2023/02/08 10:29:49.709 +00:00] [DEBUG] [indexcoord/meta_table.go:728] ["IndexCoord get index states success"] [indexID=439314234651381194] [indexRows=26934]
[2023/02/08 10:29:49.709 +00:00] [INFO] [indexcoord/index_coord.go:621] ["IndexCoord completeIndexInfo success"] [collID=439314234651381188] [totalRows=26934] [indexRows=26934] [state=Finished] [failReason=]
[2023/02/08 10:29:49.711 +00:00] [DEBUG] [indexcoord/index_coord.go:772] ["IndexCoord GetIndexInfos"] [collectionID=439314234651381188] [indexName=] [segIDs="[439314234651381218]"]
[2023/02/08 10:29:49.712 +00:00] [INFO] [indexcoord/index_coord.go:829] ["IndexCoord DescribeIndex"] [collectionID=439314234651381188] [indexName=]
[2023/02/08 10:29:49.712 +00:00] [INFO] [datacoord/services.go:730] ["received get flushed segments request"] [collectionID=439314234651381188] [partitionID=-1]
[2023/02/08 10:29:49.714 +00:00] [INFO] [indexcoord/index_coord.go:588] ["IndexCoord completeIndexInfo"] [collID=439314234651381188] [indexName=index_vector]
[2023/02/08 10:29:49.714 +00:00] [DEBUG] [indexcoord/meta_table.go:651] ["IndexCoord get index states success"] [indexID=439314234651381194] [total=0] [None=0] [Unissued=0] [InProgress=0] [Finished=0] [Failed=0]
[2023/02/08 10:29:49.714 +00:00] [DEBUG] [indexcoord/meta_table.go:728] ["IndexCoord get index states success"] [indexID=439314234651381194] [indexRows=26934]
[2023/02/08 10:29:49.714 +00:00] [INFO] [indexcoord/index_coord.go:621] ["IndexCoord completeIndexInfo success"] [collID=439314234651381188] [totalRows=26934] [indexRows=26934] [state=Finished] [failReason=]
[2023/02/08 10:29:49.714 +00:00] [INFO] [indexcoord/index_coord.go:588] ["IndexCoord completeIndexInfo"] [collID=439314234651381188] [indexName=index_hash_code]
[2023/02/08 10:29:49.714 +00:00] [DEBUG] [indexcoord/meta_table.go:651] ["IndexCoord get index states success"] [indexID=439314234651381196] [total=0] [None=0] [Unissued=0] [InProgress=0] [Finished=0] [Failed=0]
[2023/02/08 10:29:49.714 +00:00] [DEBUG] [indexcoord/meta_table.go:728] ["IndexCoord get index states success"] [indexID=439314234651381196] [indexRows=26934]
[2023/02/08 10:29:49.714 +00:00] [INFO] [indexcoord/index_coord.go:621] ["IndexCoord completeIndexInfo success"] [collID=439314234651381188] [totalRows=26934] [indexRows=26934] [state=Finished] [failReason=]
[2023/02/08 10:29:49.800 +00:00] [INFO] [rootcoord/import_manager.go:164] ["start trying to flip task state"]
[2023/02/08 10:29:49.800 +00:00] [INFO] [rootcoord/import_manager.go:780] ["import manager starts loading from Etcd"]
[2023/02/08 10:29:49.802 +00:00] [DEBUG] [querynode/flow_graph_service_time_node.go:70] ["update tSafe:"] [collectionID=439314234651381188] [tSafe=439314596403609601] [tSafe_p=2023/02/08 10:29:49.650 +00:00] [tsLag=152.318323ms] [channel=by-dev-rootcoord-dml_3_439314234651381188v1]
[2023/02/08 10:29:49.802 +00:00] [DEBUG] [querycoordv2/services.go:551] ["get metrics request received"] [msgID=0] [metricType="{\"metric_type\":\"system_info\"}"]
[2023/02/08 10:29:49.819 +00:00] [DEBUG] [proxy/impl.go:3796] [Proxy.GetProxyMetrics] [node_id=8] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info]
[2023/02/08 10:29:49.820 +00:00] [DEBUG] [datacoord/services.go:877] [DataCoord.GetMetrics] [node_id=3] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info] [metrics="status:<> response:\"{\\\"cluster\\\":{\\\"self\\\":{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datacoord3\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:13333\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":0,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727552000,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.703174194 +0000 UTC m=+0.633691516\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.703174604 +0000 UTC m=+0.633691926\\\",\\\"type\\\":\\\"datacoord\\\",\\\"id\\\":3,\\\"system_configurations\\\":{\\\"segment_max_size\\\":512},\\\"quota_metrics\\\":{\\\"TotalBinlogSize\\\":55857468}},\\\"connected_nodes\\\":[{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datanode6\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":33.33333540293884,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727552000,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.90154382 +0000 UTC m=+0.832061122\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.90154405 +0000 UTC m=+0.832061375\\\",\\\"type\\\":\\\"datanode\\\",\\\"id\\\":6,\\\"system_configurations\\\":{\\\"flush_insert_buffer_size\\\":16777216},\\\"quota_metrics\\\":{\\\"Hms\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":33.33333540293884,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727552000,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"Rms\\\":[{\\\"Label\\\":\\\"InsertConsumeThroughput\\\",\\\"Rate\\\":0},{\\\"Label\\\":\\\"DeleteConsumeThroughput\\\",\\\"Rate\\\":0}],\\\"Fgm\\\":{\\\"MinFlowGraphChannel\\\":\\\"by-dev-rootcoord-dml_2_439314234651381188v0\\\",\\\"MinFlowGraphTt\\\":439314596403609601,\\\"NumFlowGraph\\\":2}}}]},\\\"connections\\\":{\\\"name\\\":\\\"datacoord3\\\",\\\"connected_components\\\":[]}}\" component_name:\"datacoord3\" "] []
[2023/02/08 10:29:49.821 +00:00] [INFO] [proxy/multi_rate_limiter.go:96] ["RateLimiter setRates"] [rates="[{\"rt\":8,\"r\":1.7976931348623157e+308},{\"rt\":9,\"r\":1.7976931348623157e+308},{\"rt\":7,\"r\":1.7976931348623157e+308},{\"rt\":5,\"r\":1.7976931348623157e+308},{\"rt\":6,\"r\":1.7976931348623157e+308}]"]
[2023/02/08 10:29:49.954 +00:00] [INFO] [datacoord/services.go:730] ["received get flushed segments request"] [collectionID=439314234651381188] [partitionID=-1]
[2023/02/08 10:29:52.801 +00:00] [DEBUG] [querycoordv2/services.go:551] ["get metrics request received"] [msgID=0] [metricType="{\"metric_type\":\"system_info\"}"]
[2023/02/08 10:29:52.805 +00:00] [DEBUG] [proxy/impl.go:3796] [Proxy.GetProxyMetrics] [node_id=8] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info]
[2023/02/08 10:29:52.819 +00:00] [DEBUG] [datacoord/services.go:877] [DataCoord.GetMetrics] [node_id=3] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info] [metrics="status:<> response:\"{\\\"cluster\\\":{\\\"self\\\":{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datacoord3\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:13333\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":11.071202194292924,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727846912,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.703174194 +0000 UTC m=+0.633691516\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.703174604 +0000 UTC m=+0.633691926\\\",\\\"type\\\":\\\"datacoord\\\",\\\"id\\\":3,\\\"system_configurations\\\":{\\\"segment_max_size\\\":512},\\\"quota_metrics\\\":{\\\"TotalBinlogSize\\\":55857468}},\\\"connected_nodes\\\":[{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datanode6\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":13.333333002196419,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727846912,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.90154382 +0000 UTC m=+0.832061122\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.90154405 +0000 UTC m=+0.832061375\\\",\\\"type\\\":\\\"datanode\\\",\\\"id\\\":6,\\\"system_configurations\\\":{\\\"flush_insert_buffer_size\\\":16777216},\\\"quota_metrics\\\":{\\\"Hms\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":13.333333002196419,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727846912,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"Rms\\\":[{\\\"Label\\\":\\\"InsertConsumeThroughput\\\",\\\"Rate\\\":0},{\\\"Label\\\":\\\"DeleteConsumeThroughput\\\",\\\"Rate\\\":0}],\\\"Fgm\\\":{\\\"MinFlowGraphChannel\\\":\\\"by-dev-rootcoord-dml_3_439314234651381188v1\\\",\\\"MinFlowGraphTt\\\":439314597190041601,\\\"NumFlowGraph\\\":2}}}]},\\\"connections\\\":{\\\"name\\\":\\\"datacoord3\\\",\\\"connected_components\\\":[]}}\" component_name:\"datacoord3\" "] []
[2023/02/08 10:29:52.820 +00:00] [INFO] [proxy/multi_rate_limiter.go:96] ["RateLimiter setRates"] [rates="[{\"rt\":5,\"r\":1.7976931348623157e+308},{\"rt\":6,\"r\":1.7976931348623157e+308},{\"rt\":8,\"r\":1.7976931348623157e+308},{\"rt\":9,\"r\":1.7976931348623157e+308},{\"rt\":7,\"r\":1.7976931348623157e+308}]"]
[2023/02/08 10:29:55.801 +00:00] [DEBUG] [querycoordv2/services.go:551] ["get metrics request received"] [msgID=0] [metricType="{\"metric_type\":\"system_info\"}"]
[2023/02/08 10:29:55.807 +00:00] [DEBUG] [proxy/impl.go:3796] [Proxy.GetProxyMetrics] [node_id=8] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info]
[2023/02/08 10:29:55.811 +00:00] [DEBUG] [datacoord/services.go:877] [DataCoord.GetMetrics] [node_id=3] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info] [metrics="status:<> response:\"{\\\"cluster\\\":{\\\"self\\\":{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datacoord3\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:13333\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":11.49061775445457,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727871488,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.703174194 +0000 UTC m=+0.633691516\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.703174604 +0000 UTC m=+0.633691926\\\",\\\"type\\\":\\\"datacoord\\\",\\\"id\\\":3,\\\"system_configurations\\\":{\\\"segment_max_size\\\":512},\\\"quota_metrics\\\":{\\\"TotalBinlogSize\\\":55857468}},\\\"connected_nodes\\\":[{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datanode6\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":0,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727871488,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.90154382 +0000 UTC m=+0.832061122\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.90154405 +0000 UTC m=+0.832061375\\\",\\\"type\\\":\\\"datanode\\\",\\\"id\\\":6,\\\"system_configurations\\\":{\\\"flush_insert_buffer_size\\\":16777216},\\\"quota_metrics\\\":{\\\"Hms\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":0,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727871488,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"Rms\\\":[{\\\"Label\\\":\\\"InsertConsumeThroughput\\\",\\\"Rate\\\":0},{\\\"Label\\\":\\\"DeleteConsumeThroughput\\\",\\\"Rate\\\":0}],\\\"Fgm\\\":{\\\"MinFlowGraphChannel\\\":\\\"by-dev-rootcoord-dml_3_439314234651381188v1\\\",\\\"MinFlowGraphTt\\\":439314597976473601,\\\"NumFlowGraph\\\":2}}}]},\\\"connections\\\":{\\\"name\\\":\\\"datacoord3\\\",\\\"connected_components\\\":[]}}\" component_name:\"datacoord3\" "] []
[2023/02/08 10:29:55.813 +00:00] [INFO] [proxy/multi_rate_limiter.go:96] ["RateLimiter setRates"] [rates="[{\"rt\":7,\"r\":1.7976931348623157e+308},{\"rt\":5,\"r\":1.7976931348623157e+308},{\"rt\":6,\"r\":1.7976931348623157e+308},{\"rt\":8,\"r\":1.7976931348623157e+308},{\"rt\":9,\"r\":1.7976931348623157e+308}]"]
[2023/02/08 10:29:58.801 +00:00] [DEBUG] [querycoordv2/services.go:551] ["get metrics request received"] [msgID=0] [metricType="{\"metric_type\":\"system_info\"}"]
[2023/02/08 10:29:58.812 +00:00] [DEBUG] [proxy/impl.go:3796] [Proxy.GetProxyMetrics] [node_id=8] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info]
[2023/02/08 10:29:58.812 +00:00] [DEBUG] [datacoord/services.go:877] [DataCoord.GetMetrics] [node_id=3] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info] [metrics="status:<> response:\"{\\\"cluster\\\":{\\\"self\\\":{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datacoord3\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:13333\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":0,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727887872,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.703174194 +0000 UTC m=+0.633691516\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.703174604 +0000 UTC m=+0.633691926\\\",\\\"type\\\":\\\"datacoord\\\",\\\"id\\\":3,\\\"system_configurations\\\":{\\\"segment_max_size\\\":512},\\\"quota_metrics\\\":{\\\"TotalBinlogSize\\\":55857468}},\\\"connected_nodes\\\":[{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datanode6\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":75,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727887872,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.90154382 +0000 UTC m=+0.832061122\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.90154405 +0000 UTC m=+0.832061375\\\",\\\"type\\\":\\\"datanode\\\",\\\"id\\\":6,\\\"system_configurations\\\":{\\\"flush_insert_buffer_size\\\":16777216},\\\"quota_metrics\\\":{\\\"Hms\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":75,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727887872,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"Rms\\\":[{\\\"Label\\\":\\\"InsertConsumeThroughput\\\",\\\"Rate\\\":0},{\\\"Label\\\":\\\"DeleteConsumeThroughput\\\",\\\"Rate\\\":0}],\\\"Fgm\\\":{\\\"MinFlowGraphChannel\\\":\\\"by-dev-rootcoord-dml_3_439314234651381188v1\\\",\\\"MinFlowGraphTt\\\":439314598762905601,\\\"NumFlowGraph\\\":2}}}]},\\\"connections\\\":{\\\"name\\\":\\\"datacoord3\\\",\\\"connected_components\\\":[]}}\" component_name:\"datacoord3\" "] []
[2023/02/08 10:29:58.814 +00:00] [INFO] [proxy/multi_rate_limiter.go:96] ["RateLimiter setRates"] [rates="[{\"rt\":8,\"r\":1.7976931348623157e+308},{\"rt\":9,\"r\":1.7976931348623157e+308},{\"rt\":7,\"r\":1.7976931348623157e+308},{\"rt\":5,\"r\":1.7976931348623157e+308},{\"rt\":6,\"r\":1.7976931348623157e+308}]"]
[2023/02/08 10:29:59.801 +00:00] [DEBUG] [datanode/flow_graph_dd_node.go:329] ["DDNode sent delta timeTick"] [collectionID=439314234651381188] [ts=439314599011942402] [ts_p=2023/02/08 10:29:59.600 +00:00] [channel=by-dev-rootcoord-dml_3_439314234651381188v1]
[2023/02/08 10:30:01.800 +00:00] [DEBUG] [querycoordv2/services.go:551] ["get metrics request received"] [msgID=0] [metricType="{\"metric_type\":\"system_info\"}"]
[2023/02/08 10:30:01.804 +00:00] [DEBUG] [proxy/impl.go:3796] [Proxy.GetProxyMetrics] [node_id=8] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info]
[2023/02/08 10:30:01.816 +00:00] [DEBUG] [datacoord/services.go:877] [DataCoord.GetMetrics] [node_id=3] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info] [metrics="status:<> response:\"{\\\"cluster\\\":{\\\"self\\\":{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datacoord3\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:13333\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":0,\\\"memory\\\":33670107136,\\\"memory_usage\\\":728170496,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.703174194 +0000 UTC m=+0.633691516\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.703174604 +0000 UTC m=+0.633691926\\\",\\\"type\\\":\\\"datacoord\\\",\\\"id\\\":3,\\\"system_configurations\\\":{\\\"segment_max_size\\\":512},\\\"quota_metrics\\\":{\\\"TotalBinlogSize\\\":55857468}},\\\"connected_nodes\\\":[{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datanode6\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":25.000001746229827,\\\"memory\\\":33670107136,\\\"memory_usage\\\":728170496,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.90154382 +0000 UTC m=+0.832061122\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.90154405 +0000 UTC m=+0.832061375\\\",\\\"type\\\":\\\"datanode\\\",\\\"id\\\":6,\\\"system_configurations\\\":{\\\"flush_insert_buffer_size\\\":16777216},\\\"quota_metrics\\\":{\\\"Hms\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":25.000001746229827,\\\"memory\\\":33670107136,\\\"memory_usage\\\":728170496,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"Rms\\\":[{\\\"Label\\\":\\\"InsertConsumeThroughput\\\",\\\"Rate\\\":0},{\\\"Label\\\":\\\"DeleteConsumeThroughput\\\",\\\"Rate\\\":0}],\\\"Fgm\\\":{\\\"MinFlowGraphChannel\\\":\\\"by-dev-rootcoord-dml_2_439314234651381188v0\\\",\\\"MinFlowGraphTt\\\":439314599536230402,\\\"NumFlowGraph\\\":2}}}]},\\\"connections\\\":{\\\"name\\\":\\\"datacoord3\\\",\\\"connected_components\\\":[]}}\" component_name:\"datacoord3\" "] []
[2023/02/08 10:30:01.817 +00:00] [INFO] [proxy/multi_rate_limiter.go:96] ["RateLimiter setRates"] [rates="[{\"rt\":8,\"r\":1.7976931348623157e+308},{\"rt\":9,\"r\":1.7976931348623157e+308},{\"rt\":7,\"r\":1.7976931348623157e+308},{\"rt\":5,\"r\":1.7976931348623157e+308},{\"rt\":6,\"r\":1.7976931348623157e+308}]"]
[2023/02/08 10:30:04.800 +00:00] [INFO] [rootcoord/import_manager.go:164] ["start trying to flip task state"]
[2023/02/08 10:30:04.800 +00:00] [INFO] [rootcoord/import_manager.go:780] ["import manager starts loading from Etcd"]
[2023/02/08 10:30:04.800 +00:00] [DEBUG] [querycoordv2/services.go:551] ["get metrics request received"] [msgID=0] [metricType="{\"metric_type\":\"system_info\"}"]
[2023/02/08 10:30:04.808 +00:00] [DEBUG] [proxy/impl.go:3796] [Proxy.GetProxyMetrics] [node_id=8] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info]
[2023/02/08 10:30:04.812 +00:00] [DEBUG] [datacoord/services.go:877] [DataCoord.GetMetrics] [node_id=3] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info] [metrics="status:<> response:\"{\\\"cluster\\\":{\\\"self\\\":{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datacoord3\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:13333\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":0,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727941120,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.703174194 +0000 UTC m=+0.633691516\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.703174604 +0000 UTC m=+0.633691926\\\",\\\"type\\\":\\\"datacoord\\\",\\\"id\\\":3,\\\"system_configurations\\\":{\\\"segment_max_size\\\":512},\\\"quota_metrics\\\":{\\\"TotalBinlogSize\\\":55857468}},\\\"connected_nodes\\\":[{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datanode6\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":75,\\\"memory\\\":33670107136,\\\"memory_usage\\\":728072192,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.90154382 +0000 UTC m=+0.832061122\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.90154405 +0000 UTC m=+0.832061375\\\",\\\"type\\\":\\\"datanode\\\",\\\"id\\\":6,\\\"system_configurations\\\":{\\\"flush_insert_buffer_size\\\":16777216},\\\"quota_metrics\\\":{\\\"Hms\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":75,\\\"memory\\\":33670107136,\\\"memory_usage\\\":728072192,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"Rms\\\":[{\\\"Label\\\":\\\"InsertConsumeThroughput\\\",\\\"Rate\\\":0},{\\\"Label\\\":\\\"DeleteConsumeThroughput\\\",\\\"Rate\\\":0}],\\\"Fgm\\\":{\\\"MinFlowGraphChannel\\\":\\\"by-dev-rootcoord-dml_2_439314234651381188v0\\\",\\\"MinFlowGraphTt\\\":439314600322662402,\\\"NumFlowGraph\\\":2}}}]},\\\"connections\\\":{\\\"name\\\":\\\"datacoord3\\\",\\\"connected_components\\\":[]}}\" component_name:\"datacoord3\" "] []
[2023/02/08 10:30:04.814 +00:00] [INFO] [proxy/multi_rate_limiter.go:96] ["RateLimiter setRates"] [rates="[{\"rt\":5,\"r\":1.7976931348623157e+308},{\"rt\":6,\"r\":1.7976931348623157e+308},{\"rt\":8,\"r\":1.7976931348623157e+308},{\"rt\":9,\"r\":1.7976931348623157e+308},{\"rt\":7,\"r\":1.7976931348623157e+308}]"]
[2023/02/08 10:30:07.802 +00:00] [DEBUG] [querycoordv2/services.go:551] ["get metrics request received"] [msgID=0] [metricType="{\"metric_type\":\"system_info\"}"]
[2023/02/08 10:30:07.811 +00:00] [DEBUG] [proxy/impl.go:3796] [Proxy.GetProxyMetrics] [node_id=8] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info]
[2023/02/08 10:30:07.815 +00:00] [DEBUG] [datacoord/services.go:877] [DataCoord.GetMetrics] [node_id=3] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info] [metrics="status:<> response:\"{\\\"cluster\\\":{\\\"self\\\":{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datacoord3\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:13333\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":11.832946641364268,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727859200,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.703174194 +0000 UTC m=+0.633691516\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.703174604 +0000 UTC m=+0.633691926\\\",\\\"type\\\":\\\"datacoord\\\",\\\"id\\\":3,\\\"system_configurations\\\":{\\\"segment_max_size\\\":512},\\\"quota_metrics\\\":{\\\"TotalBinlogSize\\\":55857468}},\\\"connected_nodes\\\":[{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datanode6\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":0,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727859200,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.90154382 +0000 UTC m=+0.832061122\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.90154405 +0000 UTC m=+0.832061375\\\",\\\"type\\\":\\\"datanode\\\",\\\"id\\\":6,\\\"system_configurations\\\":{\\\"flush_insert_buffer_size\\\":16777216},\\\"quota_metrics\\\":{\\\"Hms\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":0,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727859200,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"Rms\\\":[{\\\"Label\\\":\\\"InsertConsumeThroughput\\\",\\\"Rate\\\":0},{\\\"Label\\\":\\\"DeleteConsumeThroughput\\\",\\\"Rate\\\":0}],\\\"Fgm\\\":{\\\"MinFlowGraphChannel\\\":\\\"by-dev-rootcoord-dml_2_439314234651381188v0\\\",\\\"MinFlowGraphTt\\\":439314601109094402,\\\"NumFlowGraph\\\":2}}}]},\\\"connections\\\":{\\\"name\\\":\\\"datacoord3\\\",\\\"connected_components\\\":[]}}\" component_name:\"datacoord3\" "] []
[2023/02/08 10:30:07.816 +00:00] [INFO] [proxy/multi_rate_limiter.go:96] ["RateLimiter setRates"] [rates="[{\"rt\":8,\"r\":1.7976931348623157e+308},{\"rt\":9,\"r\":1.7976931348623157e+308},{\"rt\":7,\"r\":1.7976931348623157e+308},{\"rt\":5,\"r\":1.7976931348623157e+308},{\"rt\":6,\"r\":1.7976931348623157e+308}]"]
[2023/02/08 10:30:09.801 +00:00] [DEBUG] [querynode/flow_graph_service_time_node.go:70] ["update tSafe:"] [collectionID=439314234651381188] [tSafe=439314601620537346] [tSafe_p=2023/02/08 10:30:09.551 +00:00] [tsLag=250.483148ms] [channel=by-dev-rootcoord-dml_2_439314234651381188v0]
[2023/02/08 10:30:10.801 +00:00] [DEBUG] [querycoordv2/services.go:551] ["get metrics request received"] [msgID=0] [metricType="{\"metric_type\":\"system_info\"}"]
[2023/02/08 10:30:10.806 +00:00] [DEBUG] [proxy/impl.go:3796] [Proxy.GetProxyMetrics] [node_id=8] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info]
[2023/02/08 10:30:10.814 +00:00] [DEBUG] [datacoord/services.go:877] [DataCoord.GetMetrics] [node_id=3] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info] [metrics="status:<> response:\"{\\\"cluster\\\":{\\\"self\\\":{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datacoord3\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:13333\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":0.000018626448439196754,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727883776,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.703174194 +0000 UTC m=+0.633691516\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.703174604 +0000 UTC m=+0.633691926\\\",\\\"type\\\":\\\"datacoord\\\",\\\"id\\\":3,\\\"system_configurations\\\":{\\\"segment_max_size\\\":512},\\\"quota_metrics\\\":{\\\"TotalBinlogSize\\\":55857468}},\\\"connected_nodes\\\":[{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datanode6\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":0,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727883776,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.90154382 +0000 UTC m=+0.832061122\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.90154405 +0000 UTC m=+0.832061375\\\",\\\"type\\\":\\\"datanode\\\",\\\"id\\\":6,\\\"system_configurations\\\":{\\\"flush_insert_buffer_size\\\":16777216},\\\"quota_metrics\\\":{\\\"Hms\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":0,\\\"memory\\\":33670107136,\\\"memory_usage\\\":727883776,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"Rms\\\":[{\\\"Label\\\":\\\"InsertConsumeThroughput\\\",\\\"Rate\\\":0},{\\\"Label\\\":\\\"DeleteConsumeThroughput\\\",\\\"Rate\\\":0}],\\\"Fgm\\\":{\\\"MinFlowGraphChannel\\\":\\\"by-dev-rootcoord-dml_2_439314234651381188v0\\\",\\\"MinFlowGraphTt\\\":439314601895526402,\\\"NumFlowGraph\\\":2}}}]},\\\"connections\\\":{\\\"name\\\":\\\"datacoord3\\\",\\\"connected_components\\\":[]}}\" component_name:\"datacoord3\" "] []
[2023/02/08 10:30:10.820 +00:00] [INFO] [proxy/multi_rate_limiter.go:96] ["RateLimiter setRates"] [rates="[{\"rt\":8,\"r\":1.7976931348623157e+308},{\"rt\":9,\"r\":1.7976931348623157e+308},{\"rt\":7,\"r\":1.7976931348623157e+308},{\"rt\":5,\"r\":1.7976931348623157e+308},{\"rt\":6,\"r\":1.7976931348623157e+308}]"]
[2023/02/08 10:30:12.601 +00:00] [DEBUG] [datanode/channel_meta.go:666] ["getChannelCheckpoint for segment"] [segmentID=439314234651381219] [isCurIBEmpty=true] [isCurDBEmpty=true] [len(hisIB)=0] [len(hisDB)=0]
[2023/02/08 10:30:12.604 +00:00] [DEBUG] [datacoord/meta.go:1196] ["UpdateChannelCheckpoint done"] [vChannel=by-dev-rootcoord-dml_2_439314234651381188v0] [time=2023/02/08 10:30:12.550 +00:00]
[2023/02/08 10:30:12.604 +00:00] [INFO] [datanode/flow_graph_time_tick_node.go:108] ["UpdateChannelCheckpoint success"] [channel=by-dev-rootcoord-dml_2_439314234651381188v0] [channelCPTs=2023/02/08 10:30:12.550 +00:00]
[2023/02/08 10:30:12.605 +00:00] [DEBUG] [datanode/channel_meta.go:666] ["getChannelCheckpoint for segment"] [segmentID=439314234651381218] [isCurIBEmpty=true] [isCurDBEmpty=true] [len(hisIB)=0] [len(hisDB)=0]
[2023/02/08 10:30:12.606 +00:00] [DEBUG] [datacoord/meta.go:1196] ["UpdateChannelCheckpoint done"] [vChannel=by-dev-rootcoord-dml_3_439314234651381188v1] [time=2023/02/08 10:30:12.550 +00:00]
[2023/02/08 10:30:12.607 +00:00] [INFO] [datanode/flow_graph_time_tick_node.go:108] ["UpdateChannelCheckpoint success"] [channel=by-dev-rootcoord-dml_3_439314234651381188v1] [channelCPTs=2023/02/08 10:30:12.550 +00:00]
[2023/02/08 10:30:13.806 +00:00] [DEBUG] [querycoordv2/services.go:551] ["get metrics request received"] [msgID=0] [metricType="{\"metric_type\":\"system_info\"}"]
[2023/02/08 10:30:13.808 +00:00] [DEBUG] [proxy/impl.go:3796] [Proxy.GetProxyMetrics] [node_id=8] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info]
[2023/02/08 10:30:13.814 +00:00] [DEBUG] [datacoord/services.go:877] [DataCoord.GetMetrics] [node_id=3] [req="{\"metric_type\":\"system_info\"}"] [metric_type=system_info] [metrics="status:<> response:\"{\\\"cluster\\\":{\\\"self\\\":{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datacoord3\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:13333\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":9.330244312555235,\\\"memory\\\":33670107136,\\\"memory_usage\\\":728035328,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.703174194 +0000 UTC m=+0.633691516\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.703174604 +0000 UTC m=+0.633691926\\\",\\\"type\\\":\\\"datacoord\\\",\\\"id\\\":3,\\\"system_configurations\\\":{\\\"segment_max_size\\\":512},\\\"quota_metrics\\\":{\\\"TotalBinlogSize\\\":55857468}},\\\"connected_nodes\\\":[{\\\"has_error\\\":false,\\\"error_reason\\\":\\\"\\\",\\\"name\\\":\\\"datanode6\\\",\\\"hardware_infos\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":22.22222176230984,\\\"memory\\\":33670107136,\\\"memory_usage\\\":728035328,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"system_info\\\":{\\\"system_version\\\":\\\"0fdc1a04\\\",\\\"deploy_mode\\\":\\\"STANDALONE\\\",\\\"build_version\\\":\\\"v2.2.2\\\",\\\"build_time\\\":\\\"Thu Dec 22 09:08:37 UTC 2022\\\",\\\"used_go_version\\\":\\\"go version go1.18.3 linux/amd64\\\"},\\\"created_time\\\":\\\"2023-02-08 10:06:49.90154382 +0000 UTC m=+0.832061122\\\",\\\"updated_time\\\":\\\"2023-02-08 10:06:49.90154405 +0000 UTC m=+0.832061375\\\",\\\"type\\\":\\\"datanode\\\",\\\"id\\\":6,\\\"system_configurations\\\":{\\\"flush_insert_buffer_size\\\":16777216},\\\"quota_metrics\\\":{\\\"Hms\\\":{\\\"ip\\\":\\\"172.21.29.5:21124\\\",\\\"cpu_core_count\\\":16,\\\"cpu_core_usage\\\":22.22222176230984,\\\"memory\\\":33670107136,\\\"memory_usage\\\":728035328,\\\"disk\\\":104857600,\\\"disk_usage\\\":2097152},\\\"Rms\\\":[{\\\"Label\\\":\\\"InsertConsumeThroughput\\\",\\\"Rate\\\":0},{\\\"Label\\\":\\\"DeleteConsumeThroughput\\\",\\\"Rate\\\":0}],\\\"Fgm\\\":{\\\"MinFlowGraphChannel\\\":\\\"by-dev-rootcoord-dml_2_439314234651381188v0\\\",\\\"MinFlowGraphTt\\\":439314602681958402,\\\"NumFlowGraph\\\":2}}}]},\\\"connections\\\":{\\\"name\\\":\\\"datacoord3\\\",\\\"connected_components\\\":[]}}\" component_name:\"datacoord3\" "] []
[2023/02/08 10:30:13.818 +00:00] [INFO] [proxy/multi_rate_limiter.go:96] ["RateLimiter setRates"] [rates="[{\"rt\":6,\"r\":1.7976931348623157e+308},{\"rt\":8,\"r\":1.7976931348623157e+308},{\"rt\":9,\"r\":1.7976931348623157e+308},{\"rt\":7,\"r\":1.7976931348623157e+308},{\"rt\":5,\"r\":1.7976931348623157e+308}]"]
╭─pon@admini ~/opt/docker-compose-public/milvus  ‹master*› 
╰─➤  

Anything else?

version: "3.5"

services:
  etcd:
    container_name: milvus-etcd
    image: quay.io/coreos/etcd:v3.5.0
    restart: always
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/etcd:/etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
    logging:
      driver: "json-file"
      options:
        max-file: "1"
        max-size: "50m"

  minio:
    container_name: milvus-minio
    image: minio/minio:RELEASE.2022-03-17T06-34-49Z
    restart: always
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    ports:
      - "9001:9001"
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/minio:/minio_data
    command: minio server /minio_data --console-address ":9001"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-file: "1"
        max-size: "50m"

  standalone:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.2.2
    command: ["milvus", "run", "standalone"]
    restart: always
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/milvus:/var/lib/milvus
    ports:
      - "19530:19530"
      - "9091:9091"
    depends_on:
      - "etcd"
      - "minio"
    logging:
      driver: "json-file"
      options:
        max-file: "1"
        max-size: "50m"

  zilliz_attu:
    container_name: zilliz_attu
    image: zilliz/attu:v2.2.2
    restart: always
    environment:
      HOST_URL: http://192.168.38.191:8000
      MILVUS_URL: 192.168.38.191:19530
    ports:
      - "8000:3000"

networks:
  default:
    name: milvus

idocker ps The relevant containers are running normally

4d82fc82f50e   running    /milvus-etcd                  25.45 MB
41cc5df6f963   running    /milvus-minio                 136.1 MB
6bdcf49a127f   running    /milvus-standalone           759.41 MB
@ponponon ponponon added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 8, 2023
@ponponon
Copy link
Author

ponponon commented Feb 8, 2023

milvus_error.log

Complete log file

@yanliang567
Copy link
Contributor

/assign @congqixia
saw an error:
[2023/02/08 10:42:44.440 +00:00] [ERROR] [grpcclient/client.go:278] ["ClientBase Call grpc first call get error"] [role=querynode] [error="err: rpc error: code = Canceled desc = context canceled\n, /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51 github.com/milvus-io/milvus/internal/util/trace.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:277 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:254 github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).Search\n/go/src/github.com/milvus-io/milvus/internal/proxy/task_search.go:485 github.com/milvus-io/milvus/internal/proxy.(*searchTask).searchShard\n/go/src/github.com/milvus-io/milvus/internal/proxy/task_policies.go:128 github.com/milvus-io/milvus/internal/proxy.mergeRoundRobinPolicy.func1\n/usr/local/go/src/runtime/asm_amd64.s:1571 runtime.goexit\n"] [stack="github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:278\ngithub.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).Search\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:254\ngithub.com/milvus-io/milvus/internal/proxy.(*searchTask).searchShard\n\t/go/src/github.com/milvus-io/milvus/internal/proxy/task_search.go:485\ngithub.com/milvus-io/milvus/internal/proxy.mergeRoundRobinPolicy.func1\n\t/go/src/github.com/milvus-io/milvus/internal/proxy/task_policies.go:128"]
/unassign

@sre-ci-robot sre-ci-robot assigned congqixia and unassigned yanliang567 Feb 8, 2023
@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 8, 2023
@yanliang567 yanliang567 added this to the 2.2.3 milestone Feb 8, 2023
@ponponon
Copy link
Author

ponponon commented Feb 8, 2023

/分配@congqixia 看到一个错误: [2023/02/08 10:42:44.440 +00:00] [ERROR] [grpcclient/client.go:278] ["ClientBase Call grpc first call get error"] [role=querynode] [error="err: rpc error: code = Canceled desc = context canceled\n, /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51 github.com/milvus-io/milvus/internal/util/trace.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:277 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:254 github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).Search\n/go/src/github.com/milvus-io/milvus/internal/proxy/task_search.go:485 github.com/milvus-io/milvus/internal/proxy.(*searchTask).searchShard\n/go/src/github.com/milvus-io/milvus/internal/proxy/task_policies.go:128 github.com/milvus-io/milvus/internal/proxy.mergeRoundRobinPolicy.func1\n/usr/local/go/src/runtime/asm_amd64.s:1571 runtime.goexit\n"] [stack="github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:278\ngithub.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).Search\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:254\ngithub.com/milvus-io/milvus/internal/proxy.(*searchTask).searchShard\n\t/go/src/github.com/milvus-io/milvus/internal/proxy/task_search.go:485\ngithub.com/milvus-io/milvus/internal/proxy.mergeRoundRobinPolicy.func1\n\t/go/src/github.com/milvus-io/milvus/internal/proxy/task_policies.go:128"] /unassign

这个问题是怎么导致的?并且如何解决?我重启 milvus server 无法解决这个问题,只能通过重新安装 milvus 来解决问题,但是遇到相同的查询依然会卡死,可以 100% 复现这个问题。(一次查询多个向量就能导致 milvus 卡死)

What caused this problem? And how can I fix it? I restarted the milvus server but could not solve the problem, I had to reinstall milvus to solve the problem, but the same query would still get stuck and I could reproduce the problem 100%. (Multiple vectors in one query can cause milvus to seize)

This is a very serious bug

@xiaofan-luan
Copy link
Contributor

image

@xiaofan-luan
Copy link
Contributor

@ponponon
there is a very large placeholder, that cause the search time out.
maybe your nq is too large, pls check

@xiaofan-luan
Copy link
Contributor

successfully placeholder is only 2060
image

@xiaofan-luan
Copy link
Contributor

/assign @ponponon

@xiaofan-luan xiaofan-luan added help wanted Extra attention is needed and removed kind/bug Issues or changes related a bug labels Feb 8, 2023
@ponponon
Copy link
Author

ponponon commented Feb 8, 2023

successfully placeholder is only 2060 image

你是指一次搜索的向量太多了?但是只要卡死,并且重启之后,只搜索一个向量,也是照样卡死的,永远无法获得结果,只能重装

You mean there are too many vectors to search at once? But as long as it gets stuck and after a reboot, only one vector is searched, it will still get stuck and never get results, only reload

@ponponon
Copy link
Author

ponponon commented Feb 8, 2023

nq

nq 是什么?我使用的是 pymilvus,这个 nq 和 pymilvus 里面哪个参数相关?

What is nq? I am using pymilvus, which parameter is related to this nq in pymilvus?

@yanliang567
Copy link
Contributor

nq means the number of vectors you are searching in one search request.

@ponponon
Copy link
Author

ponponon commented Feb 9, 2023

nq means the number of vectors you are searching in one search request.

那永久阻塞的问题如何解决呢?(即便之后使用一个向量进行查询也是永久阻塞)(除了重装)

And how to solve the problem of permanent blocking? (Even queries using a vector afterwards are permanently blocking) (except for reinstallation)

@ponponon
Copy link
Author

ponponon commented Feb 9, 2023

我现在可以确定:永久阻塞的问题和 nq 参数有关

之前,我是一次性搜索 8000+ 的向量,我现在按照你说的,改成小于 2060,用的是一次 1000 个向量搜索,没有出现『永久阻塞』

但是在另一个集合中,一次性搜索 8000+的向量是可以成功的(用时20秒+)

所以这个『永久阻塞』的问题和搜索向量数目有关系,但又不仅仅和向量数目有关系。

这两个集合的区别是:一个没有字符串字段,一次性搜索 8000+ 不会永久阻塞。另一个有字符串字段,一次性搜索 8000+ 会永久阻塞

并且永久阻塞不影响写操作,写依旧可以,但是查询请求永久阻塞,永远不可用

但是永久阻塞影响其他的集合,只要出现『永久阻塞』,所有集合的 search 操作都无法继续进行

I can now confirm that the permanent blocking problem is related to the nq parameter

Previously, I was searching 8000+ vectors at once, but now I've changed it to less than 2060, as you said, using 1000 vectors at a time, without 'permanent blocking'

But in another collection, a one-time search for 8000+ vectors was successful (in 20 seconds+)

So this 'permanent blocking' problem is related to the number of vectors searched, but not only to the number of vectors.

The difference between the two sets is that one has no string field and a one-time search of 8000+ does not block permanently. The other has a string field and a one-time search of 8000+ will block permanently

The permanent blocking does not affect writes, which are still possible, but the query request blocks permanently and is never available

However, permanent blocking affects other collections, as long as there is 'permanent blocking', all search operations on all collections will not continue

@ponponon ponponon changed the title [Bug]: milvus 搜索卡死(milvus Search Stuck) [Bug]: milvus 搜索永久阻塞(milvus Search permanent blocking) Feb 9, 2023
@ponponon
Copy link
Author

ponponon commented Feb 9, 2023

所以,如果触发了 bug 如何解决?

So, how do you resolve a bug if it is triggered?

@xiaofan-luan
Copy link
Contributor

所以,如果触发了 bug 如何解决?

So, how do you resolve a bug if it is triggered?

if nq too large, then search timeout is likely to timeout, and also OOM could happen.

if you can share you schema and your search request we could try to reproduce on our environment.

So far we know the search is timeout because the batch is too large

@xiaofan-luan
Copy link
Contributor

might be fixed by #21852 if you try to output a string field

@yanliang567
Copy link
Contributor

/assign @jiaoew1991
/unassign

@yanliang567 yanliang567 modified the milestones: 2.2.3, 2.2.4 Feb 14, 2023
@jiaoew1991 jiaoew1991 removed their assignment Feb 24, 2023
@stale
Copy link

stale bot commented Mar 17, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Mar 17, 2023
@yanliang567 yanliang567 modified the milestones: 2.2.4, 2.2.5 Mar 21, 2023
@stale stale bot removed stale indicates no udpates for 30 days labels Mar 21, 2023
@yanliang567 yanliang567 modified the milestones: 2.2.5, 2.2.6 Apr 1, 2023
@yanliang567 yanliang567 modified the milestones: 2.2.6, 2.2.7 Apr 20, 2023
@yanliang567 yanliang567 modified the milestones: 2.2.7, 2.2.9 May 4, 2023
@yanliang567 yanliang567 modified the milestones: 2.2.9, 2.2.10 Jun 8, 2023
@yanliang567 yanliang567 modified the milestones: 2.2.10, 2.2.11 Jun 16, 2023
@yanliang567 yanliang567 modified the milestones: 2.2.11, 2.2.12 Jul 3, 2023
@stale
Copy link

stale bot commented Aug 2, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Aug 2, 2023
@yanliang567 yanliang567 modified the milestones: 2.2.12, 2.2.13 Aug 4, 2023
@stale stale bot removed stale indicates no udpates for 30 days labels Aug 4, 2023
@stale
Copy link

stale bot commented Sep 3, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Sep 3, 2023
@stale stale bot closed this as completed Sep 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Issues or changes related a bug stale indicates no udpates for 30 days triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

No branches or pull requests

5 participants