Unable to use public discovery service #14447

Closed
gcool-info opened this issue Sep 11, 2022 · 8 comments

@gcool-info

What happened?

  • Bootstrapping a 3-node etcd cluster with public discovery service.
  • Each node is a separate service in a common docker-compose file
  • Getting the following error:
...
{"level":"warn","ts":"2022-09-11T13:04:32.496Z","caller":"v2discovery/discovery.go:234","msg":"failed to get from discovery server","discovery-url":"https://discovery.etcd.io","path":"/64cc6e62c503a2a6609ff7e76d9ea43a/_config/size","error":"client: etcd cluster is unavailable or misconfigured; error #0: x509: certificate signed by unknown authority\n","err-detail":"error #0: x509: certificate signed by unknown authority\n"}
{"level":"info","ts":"2022-09-11T13:04:32.496Z","caller":"v2discovery/discovery.go:297","msg":"retry connecting to discovery service","url":"https://discovery.etcd.io","reason":"cluster status check","backoff":"2s"}
...

What did you expect to happen?

Cluster to bootstrap successfully

How can we reproduce it (as minimally and precisely as possible)?

docker-compose.yaml:

version: '3.7'

services:
  etcd1:
    image: quay.io/coreos/etcd:v3.5.3
    environment:
      - ALLOW_NONE_AUTHENTICATION=yes
      - ETCD_NAME=etcd1
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd1:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
      - ETCD_ADVERTISE_CLIENT_URLS=http://etcd1:2379
      - ETCD_DISCOVERY=https://discovery.etcd.io/d6db9ed5ff85dac2466be83973194203
  etcd2:
    image: quay.io/coreos/etcd:v3.5.3
    environment:
      - ALLOW_NONE_AUTHENTICATION=yes
      - ETCD_NAME=etcd2
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd2:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
      - ETCD_ADVERTISE_CLIENT_URLS=http://etcd2:2379
      - ETCD_DISCOVERY=https://discovery.etcd.io/d6db9ed5ff85dac2466be83973194203
  etcd3:
    image: quay.io/coreos/etcd:v3.5.3
    environment:
      - ALLOW_NONE_AUTHENTICATION=yes
      - ETCD_NAME=etcd3
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd3:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
      - ETCD_ADVERTISE_CLIENT_URLS=http://etcd3:2379
      - ETCD_DISCOVERY=https://discovery.etcd.io/d6db9ed5ff85dac2466be83973194203

where ETCD_DISCOVERY was generated by running:

curl https://discovery.etcd.io/new?size=3

Anything else we need to know?

No response

Etcd version (please run commands below)

$ etcd --version
See docker-compose above

$ etcdctl version
See docker-compose above

Etcd configuration (command line flags or environment variables)

See docker-compose above

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
# paste output here

$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here

Relevant log output

{"level":"info","ts":"2022-09-11T13:42:10.744Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_ADVERTISE_CLIENT_URLS","variable-value":"http://etcd1:2379"}
{"level":"info","ts":"2022-09-11T13:42:10.746Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_DISCOVERY","variable-value":"https://discovery.etcd.io/d6db9ed5ff85dac2466be83973194203"}
{"level":"info","ts":"2022-09-11T13:42:10.747Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_ADVERTISE_PEER_URLS","variable-value":"http://etcd1:2380"}
{"level":"info","ts":"2022-09-11T13:42:10.747Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LISTEN_CLIENT_URLS","variable-value":"http://0.0.0.0:2379"}
{"level":"info","ts":"2022-09-11T13:42:10.747Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LISTEN_PEER_URLS","variable-value":"http://0.0.0.0:2380"}
{"level":"info","ts":"2022-09-11T13:42:10.747Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_NAME","variable-value":"etcd1"}
{"level":"info","ts":"2022-09-11T13:42:10.748Z","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["/usr/local/bin/etcd"]}
{"level":"warn","ts":"2022-09-11T13:42:10.748Z","caller":"etcdmain/etcd.go:105","msg":"'data-dir' was empty; using default","data-dir":"etcd1.etcd"}
{"level":"info","ts":"2022-09-11T13:42:10.749Z","caller":"embed/etcd.go:131","msg":"configuring peer listeners","listen-peer-urls":["http://0.0.0.0:2380"]}
{"level":"info","ts":"2022-09-11T13:42:10.750Z","caller":"embed/etcd.go:139","msg":"configuring client listeners","listen-client-urls":["http://0.0.0.0:2379"]}
{"level":"info","ts":"2022-09-11T13:42:10.750Z","caller":"embed/etcd.go:308","msg":"starting an etcd server","etcd-version":"3.5.3","git-sha":"0452feec7","go-version":"go1.16.15","go-os":"linux","go-arch":"amd64","max-cpu-set":2,"max-cpu-available":2,"member-initialized":false,"name":"etcd1","data-dir":"etcd1.etcd","wal-dir":"","wal-dir-dedicated":"","member-dir":"etcd1.etcd/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://etcd1:2380"],"listen-peer-urls":["http://0.0.0.0:2380"],"advertise-client-urls":["http://etcd1:2379"],"listen-client-urls":["http://0.0.0.0:2379"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"etcd1=http://etcd1:2380","initial-cluster-state":"new","initial-cluster-token":"https://discovery.etcd.io/d6db9ed5ff85dac2466be83973194203","quota-size-bytes":2147483648,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","auto-compaction-mode":"periodic","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"https://discovery.etcd.io/d6db9ed5ff85dac2466be83973194203","discovery-proxy":"","downgrade-check-interval":"5s"}
{"level":"info","ts":"2022-09-11T13:42:10.753Z","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"etcd1.etcd/member/snap/db","took":"1.628256ms"}
{"level":"info","ts":"2022-09-11T13:42:10.756Z","caller":"netutil/netutil.go:112","msg":"resolved URL Host","url":"http://etcd1:2380","host":"etcd1:2380","resolved-addr":"172.19.0.2:2380"}
{"level":"info","ts":"2022-09-11T13:42:10.759Z","caller":"netutil/netutil.go:112","msg":"resolved URL Host","url":"http://etcd1:2380","host":"etcd1:2380","resolved-addr":"172.19.0.2:2380"}
{"level":"warn","ts":"2022-09-11T13:42:11.099Z","caller":"v2discovery/discovery.go:234","msg":"failed to get from discovery server","discovery-url":"https://discovery.etcd.io","path":"/d6db9ed5ff85dac2466be83973194203/_config/size","error":"client: etcd cluster is unavailable or misconfigured; error #0: x509: certificate signed by unknown authority\n","err-detail":"error #0: x509: certificate signed by unknown authority\n"}
{"level":"info","ts":"2022-09-11T13:42:11.099Z","caller":"v2discovery/discovery.go:297","msg":"retry connecting to discovery service","url":"https://discovery.etcd.io","reason":"cluster status check","backoff":"2s"}
{"level":"warn","ts":"2022-09-11T13:42:13.442Z","caller":"v2discovery/discovery.go:234","msg":"failed to get from discovery server","discovery-url":"https://discovery.etcd.io","path":"/d6db9ed5ff85dac2466be83973194203/_config/size","error":"client: etcd cluster is unavailable or misconfigured; error #0: x509: certificate signed by unknown authority\n","err-detail":"error #0: x509: certificate signed by unknown authority\n"}
{"level":"info","ts":"2022-09-11T13:42:13.442Z","caller":"v2discovery/discovery.go:297","msg":"retry connecting to discovery service","url":"https://discovery.etcd.io","reason":"cluster status check","backoff":"4s"}
{"level":"warn","ts":"2022-09-11T13:42:17.795Z","caller":"v2discovery/discovery.go:234","msg":"failed to get from discovery server","discovery-url":"https://discovery.etcd.io","path":"/d6db9ed5ff85dac2466be83973194203/_config/size","error":"client: etcd cluster is unavailable or misconfigured; error #0: x509: certificate signed by unknown authority\n","err-detail":"error #0: x509: certificate signed by unknown authority\n"}
{"level":"info","ts":"2022-09-11T13:42:17.796Z","caller":"v2discovery/discovery.go:297","msg":"retry connecting to discovery service","url":"https://discovery.etcd.io","reason":"cluster status check","backoff":"8s"}

gcool-info changed the title from "Public etcd discovery service returning errors" to "Public discovery service returning errors" on Sep 11, 2022
gcool-info changed the title from "Public discovery service returning errors" to "Unable to use public discovery service" on Sep 11, 2022

@ahrtr
Member

ahrtr commented Sep 11, 2022

Points:

  1. Using v2discovery in a production environment isn't recommended; it's going to be replaced by v3discovery.
  2. I just implemented a simple demo and confirmed that v2discovery works. Again, it isn't recommended for production use.
  3. The issue you raised is actually related to docker-debian-artifacts/issues/15. Note that since Go 1.16, CreateCertificate verifies the generated certificate's signature using the signer's public key, and returns an error if the signature is invalid. Please also refer to v2discovery/main.go#L20-L26.
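
A workaround sometimes used for `x509: certificate signed by unknown authority` is to give the container a CA bundle. The fragment below is a minimal sketch, assuming the cause is that the image ships without a CA bundle and that the Docker host is a Debian/Ubuntu-style system with a bundle at `/etc/ssl/certs/ca-certificates.crt`. The volume mount, the host path, and the `SSL_CERT_FILE` setting are assumptions for illustration; none of them was confirmed in this thread.

```yaml
# Sketch only: assumes the etcd image has no CA bundle and the host keeps one
# at /etc/ssl/certs/ca-certificates.crt (Debian/Ubuntu layout). Adjust the
# host path for other distributions. Repeat the same volume and SSL_CERT_FILE
# entries for etcd2 and etcd3.
version: '3.7'

services:
  etcd1:
    image: quay.io/coreos/etcd:v3.5.3
    volumes:
      # Read-only mount of the host CA bundle into the container.
      - /etc/ssl/certs/ca-certificates.crt:/etc/ssl/certs/ca-certificates.crt:ro
    environment:
      # Go's crypto/x509 honors SSL_CERT_FILE on Unix systems other than macOS.
      - SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
      - ETCD_NAME=etcd1
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd1:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
      - ETCD_ADVERTISE_CLIENT_URLS=http://etcd1:2379
      - ETCD_DISCOVERY=https://discovery.etcd.io/d6db9ed5ff85dac2466be83973194203
```

If the warning persists with a bundle in place, the cause is likely something other than a missing trust store.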

@gcool-info
Author

Great, thanks! Two quick follow-up questions:

  • Will v3discovery be recommended for use in a production environment?
  • Out of curiosity, why isn't v2discovery recommended? Is there anywhere I can read up on this?

@ahrtr
Member

ahrtr commented Sep 12, 2022

  • Will v3discovery be recommended for use in a production environment?

I would say yes, but since it's a new feature in 3.6, it might be buggy. Could you share your use case for v3discovery? Are you going to use a centralized etcd cluster as a discovery service to bootstrap and manage all other etcd clusters?

  • Out of curiosity, why isn't v2discovery recommended? Is there anywhere I can read up on this?

v2discovery is based on the V2 API, which has been deprecated for years. The V2 server has already been removed from the code base, and the V2 client API might be removed in the next minor release (3.7). FYI: https://etcd.io/docs/v3.4/op-guide/v2-migration/

@gcool-info
Author

Could you share your use case for v3discovery? Are you going to use a centralized etcd cluster as a discovery service to bootstrap and manage all other etcd clusters?

Thanks for asking this. We're deploying etcd as an AWS Fargate service. The IPs of nodes are not known beforehand. We're exploring the following options:

So, with this ticket, I'm basically trying to understand whether public etcd discovery using v3discovery would be a recommended setup.

@ahrtr
Member

ahrtr commented Sep 13, 2022

Thanks for the info.

The IPs of nodes are not known beforehand.

Just to double-check, do you mean that each etcd member/node doesn't know the IPs of the other members/nodes in the same cluster beforehand? If so, I'm curious about the real scenario. Would you mind sharing more details?

@gcool-info
Author

Yes, of course. More generally, the scenario is described in the discovery docs section:

In a number of cases, the IPs of the cluster peers may not be known ahead of time. This is common when utilizing cloud providers or when the network uses DHCP.

Talking specifics, our setup is:

  • 1 AWS Fargate service
  • spins up 3 etcd containers
  • each container is assigned a private IP address at startup

Note that assigning an address to each container myself is a pain and quite convoluted.


FYI: I've managed to make this work with DNS discovery. I'm just curious now, more than anything, on the pros/cons of the two approaches :)

@ahrtr
Member

ahrtr commented Sep 14, 2022

Thanks for the feedback.

FYI: I've managed to make this work with DNS discovery. I'm just curious now, more than anything, on the pros/cons of the two approaches :)

DNS discovery depends on the DNS service to return all other peers' URLs, so you need to make sure the DNS service is correctly configured to resolve _etcd-server-ssl._tcp.example.com or _etcd-server._tcp.example.com to the correct SRV records.
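
As a rough illustration of the DNS approach, the fragment below is a minimal sketch of one member configured for SRV discovery. The domain `example.com`, the record targets, the TTL, and the cluster token are placeholders rather than values from this thread, and the SRV records shown in the comments are assumed to exist in the zone already.

```yaml
# Sketch only: example.com, the hostnames, and the token are placeholders.
# The DNS zone is assumed to hold one SRV record per member, for example:
#   _etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 etcd1.example.com.
#   _etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 etcd2.example.com.
#   _etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 etcd3.example.com.
version: '3.7'

services:
  etcd1:
    image: quay.io/coreos/etcd:v3.5.3
    environment:
      - ETCD_NAME=etcd1
      # Domain whose SRV records list the cluster peers.
      - ETCD_DISCOVERY_SRV=example.com
      # The advertised peer URL should match one of the SRV targets.
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd1.example.com:2380
      - ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
      - ETCD_INITIAL_CLUSTER_STATE=new
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
      - ETCD_ADVERTISE_CLIENT_URLS=http://etcd1.example.com:2379
```

The other two members would use the same `ETCD_DISCOVERY_SRV` and token, with their own names and advertised URLs.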

v3discovery depends on a dedicated etcd cluster as a discovery service. The obvious con is that you need to deploy a separate, dedicated etcd cluster. The pros are:

  • You (as an administrator) can list/view all the target clusters via the discovery etcd cluster using etcdctl or a customized client tool.
  • The etcd community also supports this case.
  • Each member can automatically register itself. So you don't need to manually register each member's IP or FQDN in the discovery service.
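
For comparison, a v3discovery setup might look roughly like the fragment below. This is a sketch under several assumptions: etcd 3.6 or later, a separate discovery cluster reachable at the hypothetical address `http://discovery-etcd:2379`, a placeholder token, and environment variable names derived from the 3.6 `--discovery-token` and `--discovery-endpoints` flags. The registry key in the comment is likewise an assumption about how the expected cluster size is stored, so check it against the documentation for the release in use.

```yaml
# Sketch only: assumes etcd 3.6+, a hypothetical discovery cluster at
# http://discovery-etcd:2379, and a placeholder token.
# The expected cluster size is assumed to be written to the discovery cluster
# before the members start, for example:
#   etcdctl --endpoints=http://discovery-etcd:2379 \
#     put /_etcd/registry/<token>/_config/size 3
version: '3.7'

services:
  etcd1:
    image: quay.io/coreos/etcd:v3.6.0
    environment:
      - ETCD_NAME=etcd1
      # Same token and endpoints on every member of the target cluster.
      - ETCD_DISCOVERY_TOKEN=replace-with-token
      - ETCD_DISCOVERY_ENDPOINTS=http://discovery-etcd:2379
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd1:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
      - ETCD_ADVERTISE_CLIENT_URLS=http://etcd1:2379
```

Each member registers its own peer URL under the token, which is why no member addresses appear in the configuration.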

@gcool-info
Author

Great, thanks for all the help! Closing this issue
