This repository has been archived by the owner on Aug 16, 2023. It is now read-only.

Table 'milvus_meta.collections' doesn't exist #420

Closed
xyfleet opened this issue Feb 13, 2023 · 14 comments

Comments

@xyfleet

xyfleet commented Feb 13, 2023

Hi Team,

I see the error below in the rootcoord pod log. As a result, the rootcoord pod stays in standby mode.

2023/02/13 19:02:17 /go/src/github.com/milvus-io/milvus/internal/metastore/db/dao/collection.go:39 Error 1146: Table 'milvus_meta.collections' doesn't exist

But from the log, I can see the connection to the external mysql DB is good.

[2023/02/13 19:02:17.951 +00:00] [INFO] [dbcore/core.go:50] ["db connected success"] [host=xxx-yyy-xxx] [port=3306] [database=milvus_meta]

Then all of the dependent pods (datanode, querycoord, proxy, indexcoord) are stuck in WaitForComponentStates:

[2023/02/13 20:03:43.413 +00:00] [WARN] [retry/retry.go:39] ["retry func failed"] ["retry time"=10] [error="WaitForComponentStates, not meet, RootCoord current state: StandBy"]
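
When many pods emit the same retry warning, it can help to pull out which coordinator they are all waiting on. The following is only a rough sketch: it assumes the bracketed log layout seen in the lines quoted in this thread, and the function name `blocking_component` is made up for illustration, not part of Milvus.

```python
import re

# Assumed layout, based only on the log lines quoted in this thread:
# [timestamp] [LEVEL] [file:line] ["message"] [key=value] ...
_HEADER = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<source>[^\]]+)\]'
)
_BLOCKED = re.compile(
    r'WaitForComponentStates, not meet, '
    r'(?P<component>\w+) current state: (?P<state>\w+)'
)


def blocking_component(line):
    """Return (level, source, component, state) for a WaitForComponentStates
    retry line, or None if the line does not match that shape."""
    header = _HEADER.match(line)
    blocked = _BLOCKED.search(line)
    if header is None or blocked is None:
        return None
    return (
        header.group("level"),
        header.group("source"),
        blocked.group("component"),
        blocked.group("state"),
    )


if __name__ == "__main__":
    sample = (
        '[2023/02/13 20:03:43.413 +00:00] [WARN] [retry/retry.go:39] '
        '["retry func failed"] ["retry time"=10] '
        '[error="WaitForComponentStates, not meet, '
        'RootCoord current state: StandBy"]'
    )
    print(blocking_component(sample))
```

Running it over the logs of every pod shows quickly whether they all point at the same coordinator, which is then the pod whose own log deserves a closer look.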

Background:
I installed this chart with an external MySQL.
Below is part of my values.yaml:

rootCoordinator:
  replicas: 2
  activeStandby:
    enabled: true

queryCoordinator:
  replicas: 2
  activeStandby:
    enabled: true
  
indexCoordinator:
  replicas: 2
  activeStandby:
    enabled: true

dataCoordinator:
  replicas: 2
  activeStandby:
    enabled: true

externalMysql:
  enabled: true
  username: xxxxx
  password: yyyyy
  address: "xxx.xxx.xxx"
  port: 3306
  dbName: milvus_meta

One more question here, I did not enable profiling in each component. Is this a problem?

Any help will be appreciated.

@xyfleet
Author

xyfleet commented Feb 13, 2023

It looks like the script failed to create the tables in the DB. https://github.com/milvus-io/milvus/blob/master/tests/scripts/values/mysql.yaml
Does the script above only work for the internal MySQL DB?

@NicoYuan1986 could you please help take a look at this one?

@LoveEachDay
Contributor

@xyfleet You need to manually provision a MySQL database with the tables first.
You can use the SQL statements from https://github.com/milvus-io/milvus/blob/master/tests/scripts/values/mysql.yaml.

@xyfleet
Author

xyfleet commented Feb 14, 2023

@LoveEachDay Thank you so much. I added these tables manually. Then I hit another issue.

In the rootcoord pod:

[2023/02/14 06:23:10.893 +00:00] [ERROR] [grpcclient/client.go:149] ["failed to get client address"] [error="find no available querycoord, check querycoord state"] ...
[2023/02/14 06:23:10.893 +00:00] [ERROR] [grpcclient/client.go:305] ["ClientBase ReCall grpc second call get error"] [role=querycoord] [error="err: find no available querycoord, check querycoord state
[2023/02/14 06:23:10.893 +00:00] [WARN] [rootcoord/quota_center.go:129] ["quotaCenter sync metrics failed"] [error="quotaCenter get Data cluster failed, err = DataCoord 171 is not ready"]

In DataCoord pod:

[2023/02/14 06:25:52.889 +00:00] [WARN] [datacoord/services.go:849] ["DataCoord.GetMetrics failed"] [traceID=4cdb1156d9ffea02] [nodeID=171] [req="{\"metric_type\":\"system_info\"}"] [error="DataCoord 171 is not ready"]

In the querycoord, proxy, and datanode pods:

[2023/02/14 06:22:27.550 +00:00] [WARN] [retry/retry.go:39] ["retry func failed"] ["retry time"=0] [error="WaitForComponentStates, not meet, DataCoord current state: StandBy"]

In the querynode pod:

[2023/02/14 06:25:26.150 +00:00] [ERROR] [querynode/query_node.go:271] ["QueryNode init vector storage failed"] [error="Endpoint url cannot have fully qualified paths."] [stack="github.com/milvus-io/milvus/internal/querynode.(*QueryNode).Init.func1\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/query_node.go:271\nsync.(*Once).doSlow\n\t/usr/local/go/src/sync/once.go:68\nsync.(*Once).Do\n\t/usr/local/go/src/sync/once.go:59\ngithub.com/milvus-io/milvus/internal/querynode.(*QueryNode).Init\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/query_node.go:249\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).init\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:133\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:213\ngithub.com/milvus-io/milvus/cmd/components.(*QueryNode).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/query_node.go:54\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:104"]
[2023/02/14 06:25:26.150 +00:00] [ERROR] [querynode/service.go:134] ["QueryNode init error: "] [error="Endpoint url cannot have fully qualified paths."] [stack="github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).init\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:134\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:213\ngithub.com/milvus-io/milvus/cmd/components.(*QueryNode).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/query_node.go:54\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:104"]
panic: Endpoint url cannot have fully qualified paths.

At the same time, I see some errors from the etcd and zookeeper pods.

errors from etcd:

{"level":"warn","ts":"2023-02-14T05:59:31.783Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"8.867µs","request":"header:<ID:5789524982615469658 > lease_revoke:<id:5058864c270ae1ce>","response":"size:27","error":"lease not found"}

errors from one zookeeper pod:

2023-02-13 19:00:58,522 [myid:2] - ERROR [LeaderConnector-milvus-db-zookeeper-2.milvus-db-zookeeper-headless.milvus.svc.cluster.local:2888:Learner$LeaderConnector@389] - Failed connect to milvus-db-zookeeper-2.milvus-db-zookeeper-headless.milvus.svc.cluster.local:2888
java.net.UnknownHostException: milvus-db-zookeeper-2.milvus-db-zookeeper-headless.milvus.svc.cluster.local

I think something is wrong with my etcd and kafka-zookeeper settings. Could this be why the Milvus pods still fail?

Do you have any ideas about the etcd and zookeeper errors? Also, please check whether there is an issue with my external S3 and kafka settings:

minio:
  enabled: false

etcd:
  persistence:
    storageClass: ebs-sc
    accessMode: ReadWriteOnce
    size: 10Gi

pulsar:
  enabled: false

kafka:
  enabled: true
  persistence:
    enabled: true
    storageClass: ebs-sc
    accessMode: ReadWriteOnce
    size: 300Gi
  metrics:
    ## Prometheus Kafka exporter: exposes complementary metrics to JMX exporter
    kafka:
      enabled: true
    jmx:
      enabled: true
    serviceMonitor:
      enabled: true


externalS3:
  enabled: true
  host: "xxxx"
  port: "80"
  accessKey: "s3_access_key"
  secretkey: "s3_secret_key"
  bucketName: "s3_bucket_name"

@LoveEachDay
Contributor

@xyfleet Could you use this script to export logs for all components?

@xyfleet
Author

xyfleet commented Feb 14, 2023

@LoveEachDay Today, I reconfigured the kafka, zookeeper and etcd. Still get the same error. Logs attached. Thanks.

milvus-log.tar.gz

@LoveEachDay
Contributor

@xyfleet From the provided log:

[2023/02/14 20:03:08.123 +00:00] [ERROR] [datacoord/server.go:406] ["chunk manager init failed"] [error="Endpoint url cannot have fully qualified paths."] [stack="github.com/milvus-io/milvus/internal/datacoord.(*Server).newChunkManagerFactory\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:406\ngithub.com/milvus-io/milvus/internal/datacoord.(*Server).initDataCoord\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:293\ngithub.com/milvus-io/milvus/internal/datacoord.(*Server).Init.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:270\ngithub.com/milvus-io/milvus/internal/util/sessionutil.(*Session).ProcessActiveStandBy\n\t/go/src/github.com/milvus-io/milvus/internal/util/sessionutil/session_util.go:811\ngithub.com/milvus-io/milvus/internal/datacoord.(*Server).Register\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:224\ngithub.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).start\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/service.go:195\ngithub.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/service.go:246\ngithub.com/milvus-io/milvus/cmd/components.(*DataCoord).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/data_coord.go:49\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:104"]
[2023/02/14 20:03:08.123 +00:00] [ERROR] [datacoord/server.go:271] ["DataCoord init failed"] [error="Endpoint url cannot have fully qualified paths."] [stack="github.com/milvus-io/milvus/internal/datacoord.(*Server).Init.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:271\ngithub.com/milvus-io/milvus/internal/util/sessionutil.(*Session).ProcessActiveStandBy\n\t/go/src/github.com/milvus-io/milvus/internal/util/sessionutil/session_util.go:811\ngithub.com/milvus-io/milvus/internal/datacoord.(*Server).Register\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:224\ngithub.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).start\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/service.go:195\ngithub.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/service.go:246\ngithub.com/milvus-io/milvus/cmd/components.(*DataCoord).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/data_coord.go:49\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:104"]
[2023/02/14 20:03:08.123 +00:00] [WARN] [datacoord/service.go:197] ["DataCoord register service failed"] [error="Endpoint url cannot have fully qualified paths."]

The value of externalS3.host is invalid: it seems your host includes a path. Provide just the host name instead.
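
The reply above attributes "Endpoint url cannot have fully qualified paths." to a path inside externalS3.host. As a rough pre-flight check, a value can be inspected before it goes into values.yaml. This is a sketch only: `check_s3_host` is a hypothetical helper, not part of Milvus or the chart, and flagging a scheme such as `https://` is an assumption on top of what the thread states (a bare host name has neither a scheme nor a path).

```python
from urllib.parse import urlsplit


def check_s3_host(host):
    """Return a list of problems found in a value meant for
    externalS3.host; an empty list means it looks like a bare host."""
    value = host.strip()
    if not value:
        return ["host is empty"]

    # Prefix '//' on scheme-less input so urlsplit parses the text as a
    # network location instead of treating all of it as a path.
    parts = urlsplit(value if "://" in value else "//" + value)

    problems = []
    if parts.scheme:
        # Assumption: the scheme is implied by useSSL, not written in host.
        problems.append("remove the scheme '%s://'" % parts.scheme)
    if parts.path not in ("", "/"):
        # This is the case the reply above points at.
        problems.append("remove the path '%s'" % parts.path)
    return problems


if __name__ == "__main__":
    for candidate in (
        "s3.us-east-1.amazonaws.com",
        "s3.us-east-1.amazonaws.com/my-s3-bucket",
        "https://s3.us-east-1.amazonaws.com/my-s3-bucket",
    ):
        print(candidate, "->", check_s3_host(candidate) or "looks fine")
```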

@xyfleet
Author

xyfleet commented Feb 15, 2023

@LoveEachDay Thanks a lot. The host, you mean something like this
"my_bucket.s3.region-code.amazonaws.com", right?

@haorenfsa
Contributor

haorenfsa commented Feb 15, 2023

@xyfleet Also set externalS3.port to 443 and externalS3.useSSL to true.

@LoveEachDay
Contributor

> @LoveEachDay Thanks a lot. The host, you mean something like this "my_bucket.s3.region-code.amazonaws.com", right?

Yes

@xyfleet
Author

xyfleet commented Feb 15, 2023

@LoveEachDay I tried several times and still get this error:

[2023/02/15 05:23:18.973 +00:00] [WARN] [storage/minio_chunk_manager.go:106] ["failed to check blob bucket exist"] [bucket=my-bucket-name] [error="Access Denied."]

I tried two different S3 accounts but got the same error. I think my s3-user has the right permissions. Do you have any idea?

s3-user permission:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:PutObjectACL",
                "s3:PutObject",
                "s3:ListBucket",
                "s3:GetObjectVersion",
                "s3:GetObjectACL",
                "s3:GetObject",
                "s3:DeleteObjectVersion",
                "s3:DeleteObject"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::my-s3-bucket/*",
                "arn:aws:s3:::my-s3-bucket"
            ],
            "Sid": ""
        }
    ]
}

externalS3 config:

externalS3:
  enabled: true
  host: "s3-bucket.s3.us-east-1.amazonaws.com"
  port: "443"
  accessKey: "externa_s3_access_key"
  secretkey: "externa_s3_secret_key"
  bucketName: "my-s3-bucket"
  useSSL: true

@haorenfsa
Contributor

haorenfsa commented Feb 15, 2023

@xyfleet Sorry, my mistake. Milvus accesses the bucket through the URL path; it doesn't support addressing the bucket through the host name. Change your externalS3.host to s3.us-east-1.amazonaws.com.
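
To make the difference concrete: S3 accepts two addressing styles, path-style (bucket in the URL path) and virtual-hosted-style (bucket in the host name). The sketch below only builds the two URL shapes for comparison; the function names and the object key are illustrative, and it does not reproduce how Milvus itself assembles requests.

```python
from urllib.parse import quote


def path_style_url(host, bucket, key="", use_ssl=True):
    """Bucket goes into the URL path: https://<host>/<bucket>/<key>."""
    scheme = "https" if use_ssl else "http"
    return "%s://%s/%s/%s" % (scheme, host, bucket, quote(key.lstrip("/")))


def virtual_hosted_url(host, bucket, key="", use_ssl=True):
    """Bucket goes into the host name: https://<bucket>.<host>/<key>."""
    scheme = "https" if use_ssl else "http"
    return "%s://%s.%s/%s" % (scheme, bucket, host, quote(key.lstrip("/")))


if __name__ == "__main__":
    host = "s3.us-east-1.amazonaws.com"
    bucket = "my-s3-bucket"
    key = "files/segment.bin"  # hypothetical object key

    print(path_style_url(host, bucket, key))
    print(virtual_hosted_url(host, bucket, key))

    # With path-style addressing, putting the bucket into the host as well
    # names the bucket twice, once in the host and once in the path.
    print(path_style_url(bucket + "." + host, bucket, key))
```

With path-style addressing the host stays the plain regional endpoint and the bucket is supplied separately through externalS3.bucketName.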

@xyfleet
Author

xyfleet commented Feb 15, 2023

@haorenfsa Thanks for your update. I updated my config and still get the error. Really weird.

externalS3:
  enabled: true
  host: s3.us-east-1.amazonaws.com
  port: "443"
  accessKey: "externa_s3_access_key"
  secretkey: "externa_s3_secret_key"
  bucketName: "my-s3-bucket"
  useSSL: true

[2023/02/15 05:50:16.219 +00:00] [WARN] [storage/minio_chunk_manager.go:106] ["failed to check blob bucket exist"] [bucket=my-s3-bucket] [error="Access Denied."]

I found this Python code: https://github.com/milvus-io/milvus/blob/master/tests/benchmark/milvus_benchmark/update.py
I am not sure if this works as expected.

values_dict['minio']['enabled'] = True
# values_dict["externalS3"]["enabled"] = True
values_dict["externalS3"]["enabled"] = False
values_dict["externalS3"]["host"] = config.MINIO_HOST
values_dict["externalS3"]["port"] = config.MINIO_PORT
values_dict["externalS3"]["accessKey"] = config.MINIO_ACCESS_KEY
values_dict["externalS3"]["secretKey"] = config.MINIO_SECRET_KEY
values_dict["externalS3"]["bucketName"] = config.MINIO_BUCKET_NAME
logging.debug(values_dict["externalS3"])
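
One detail that can be read off this thread: the values posted earlier spell the key `secretkey`, while the script quoted above writes `secretKey`. YAML keys are case-sensitive, so these are two different keys. Whether the chart reads only the camel-case spelling, and whether this explains the Access Denied error, is not confirmed anywhere in this thread. The sketch below just flags keys that differ from an expected name only by letter case; `EXPECTED_KEYS` is an assumption assembled from the script above plus the `enabled` and `useSSL` keys used in this thread, and `case_mismatched_keys` is a hypothetical helper.

```python
# Assumed key names: those written by the script quoted above, plus the
# "enabled" and "useSSL" keys that appear in the values posted in this thread.
EXPECTED_KEYS = (
    "enabled", "host", "port", "accessKey",
    "secretKey", "bucketName", "useSSL",
)


def case_mismatched_keys(section, expected=EXPECTED_KEYS):
    """Map each key in `section` that matches an expected key only when
    letter case is ignored to the expected spelling."""
    by_lower = {name.lower(): name for name in expected}
    mismatches = {}
    for key in section:
        canonical = by_lower.get(key.lower())
        if canonical is not None and canonical != key:
            mismatches[key] = canonical
    return mismatches


if __name__ == "__main__":
    # Same shape as the externalS3 block posted above, placeholders kept.
    posted = {
        "enabled": True,
        "host": "s3.us-east-1.amazonaws.com",
        "port": "443",
        "accessKey": "externa_s3_access_key",
        "secretkey": "externa_s3_secret_key",
        "bucketName": "my-s3-bucket",
        "useSSL": True,
    }
    print(case_mismatched_keys(posted))
```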

@xyfleet
Author

xyfleet commented Feb 15, 2023

@haorenfsa Do you think we can have a quick zoom meeting to troubleshoot this one?

@LoveEachDay
Contributor

> @haorenfsa Do you think we can have a quick zoom meeting to troubleshoot this one?

@xyfleet Could you join the Slack channel?

xyfleet closed this as completed Feb 17, 2023